CN107908626A - Method and device for calculating company similarity - Google Patents

Method and device for calculating company similarity

Info

Publication number
CN107908626A
Authority
CN
China
Prior art keywords
similarity
two
companies
information
company
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611265737.1A
Other languages
Chinese (zh)
Inventor
于秋林
陈尧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
OneConnect Smart Technology Co Ltd
Original Assignee
OneConnect Financial Technology Co Ltd Shanghai
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by OneConnect Financial Technology Co Ltd Shanghai filed Critical OneConnect Financial Technology Co Ltd Shanghai
Priority to CN201611265737.1A priority Critical patent/CN107908626A/en
Publication of CN107908626A publication Critical patent/CN107908626A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 Approximate or statistical queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Fuzzy Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and device for calculating company similarity. The method includes: on receiving a similarity calculation request for two companies, calculating the similarity of the two companies under each information dimension according to their child node positions in the pre-established information structure tree of that dimension, where each information structure tree has a main node and child nodes and is built for a predetermined information dimension from company information obtained from a predetermined database; and calculating, from the similarities of the two companies under each information dimension and based on a predetermined similarity calculation rule, the overall similarity of the two companies across all information dimensions, which is taken as the similarity between the two companies. The invention improves the accuracy of calculating similarity between companies.

Description

Method and device for calculating company similarity
Technical Field
The present invention relates to the field of computer technology, and in particular to a method and device for calculating company similarity.
Background
At present, data mining work frequently requires the similarity between attributes, such as the similarity between companies. Existing industry approaches to calculating the similarity of two companies are mainly based on characters and words. For example, the similarity between Ping An Insurance and Ping An Technology will be higher than that between Ping An Financial Technology and Qianhai Credit Reference, because the first two names contain the same words. However, a similarity calculation scheme based on characters and words generally cannot reflect the real similarity of companies. For example, Shenzhen Qianhai Credit Reference Center Co., Ltd. belongs to Shenzhen Ping An Financial Technology Consulting Co., Ltd., so the similarity of these two companies should be high, yet under a scheme based on characters and words it is very low. How to accurately calculate the similarity of different companies has therefore become a technical problem to be solved urgently.
Summary of the Invention
The main object of the present invention is to provide a method and device for calculating company similarity, with the aim of accurately calculating the similarity of different companies.
To achieve the above object, the present invention provides a method for calculating company similarity, the method including the following steps:
If a similarity calculation request for two companies is received, calculating the similarity of the two companies under each information dimension according to the child node positions of the two companies in the pre-established information structure tree of each information dimension; where the information structure tree is an information structure tree with a main node and child nodes, established for each predetermined information dimension from company information obtained from a predetermined database;
Calculating, from the similarity of the two companies under each information dimension and based on a predetermined similarity calculation rule, the overall similarity of the two companies across all information dimensions, and taking the overall similarity as the similarity between the two companies.
Preferably, the step of calculating the similarity of the two companies under each information dimension includes:
In the information structure tree of an information dimension, calculating the number of child nodes and/or main nodes contained in the node path from the child node position of one of the two companies to the child node position of the other, and taking the calculated number of nodes as the similarity of the two companies under that information dimension.
Preferably, the predetermined similarity calculation rule is:
Weighting the similarity of the two companies under each information dimension according to predetermined similarity weights, to obtain the overall similarity of the two companies across all information dimensions.
Preferably, the predetermined similarity calculation rule is:
Averaging the similarity of the two companies under each information dimension, to obtain the overall similarity of the two companies across all information dimensions.
Preferably, the predetermined database includes the company information database of the Administration for Industry and Commerce and/or the company information database of a stock exchange; the information dimensions include industry classification, holding relationship and/or personnel structure.
In addition, to achieve the above object, the present invention also provides a device for calculating company similarity, the device including:
A first computing module, configured to, if a similarity calculation request for two companies is received, calculate the similarity of the two companies under each information dimension according to the child node positions of the two companies in the pre-established information structure tree of each information dimension; where the information structure tree is an information structure tree with a main node and child nodes, established for each predetermined information dimension from company information obtained from a predetermined database;
A second computing module, configured to calculate, from the similarity of the two companies under each information dimension and based on a predetermined similarity calculation rule, the overall similarity of the two companies across all information dimensions, and to take the overall similarity as the similarity between the two companies.
Preferably, the first computing module is further configured to:
In the information structure tree of an information dimension, calculate the number of child nodes and/or main nodes contained in the node path from the child node position of one of the two companies to the child node position of the other, and take the calculated number of nodes as the similarity of the two companies under that information dimension.
Preferably, the predetermined similarity calculation rule is:
Weighting the similarity of the two companies under each information dimension according to predetermined similarity weights, to obtain the overall similarity of the two companies across all information dimensions.
Preferably, the predetermined similarity calculation rule is:
Averaging the similarity of the two companies under each information dimension, to obtain the overall similarity of the two companies across all information dimensions.
Preferably, the predetermined database includes the company information database of the Administration for Industry and Commerce and/or the company information database of a stock exchange; the information dimensions include industry classification, holding relationship and/or personnel structure.
With the method and device for calculating company similarity proposed by the present invention, when a similarity calculation request for two companies is received, the overall similarity of the two companies across all information dimensions is calculated from their similarity under each information dimension, giving the similarity between the two companies. Because the calculation does not rely only on the characters and words of the two company names, but also draws on the similarity of the two companies under each distinct information dimension, the overall similarity is calculated in greater depth and more accurately, which improves the accuracy of calculating similarity between companies.
Brief Description of the Drawings
Fig. 1 is a flow diagram of one embodiment of the method for calculating company similarity of the present invention;
Fig. 2 is a functional block diagram of one embodiment of the device for calculating company similarity of the present invention.
The realization of the object, the functional features, and the advantages of the present invention are further described with reference to the embodiments and the accompanying drawings.
Detailed Description
To make the technical problems to be solved, the technical solutions, and the advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and are not intended to limit it.
The present invention provides a method for calculating company similarity.
Referring to Fig. 1, Fig. 1 is a flow diagram of one embodiment of the method for calculating company similarity of the present invention.
In one embodiment, the method for calculating company similarity includes:
Step S10: if a similarity calculation request for two companies is received, calculating the similarity of the two companies under each information dimension according to the child node positions of the two companies in the pre-established information structure tree of each information dimension; where the information structure tree is an information structure tree with a main node and child nodes, established for each predetermined information dimension from company information obtained from a predetermined database;
In this embodiment, a request sent by a user to calculate the similarity of different companies is received. For example, the request may be sent after the user enters relevant information (e.g., the names of the companies whose similarity is to be calculated) on a terminal such as a mobile phone, tablet computer, or self-service terminal device, either in a company similarity calculation application (APP) pre-installed on the terminal or in a browser on the terminal.
It should be noted that, for ease of description, this embodiment describes only the similarity calculation for two companies; the process for multiple companies can follow the two-company process and is not repeated here.
After the similarity calculation request for the two companies is received, the similarity of the two companies under each information dimension can be calculated from their child node positions in the pre-established information structure tree of each dimension. For example, the length of the node path between the two companies, the number of nodes the path passes through, and so on can be calculated from those positions and used as the similarity of the two companies under each information dimension. The information structure trees are established as follows:
Company information (e.g., company name, industry, main business scope, personnel structure information, shareholder information) is obtained from a predetermined database (e.g., the company information database of the Administration for Industry and Commerce or the company information database of a stock exchange). From the obtained company information, an information structure tree with a main node and child nodes is established for each predetermined information dimension (e.g., industry classification, holding relationship, personnel structure).
For example, the information structure tree with a main node and child nodes is established for the different information dimensions as follows:
1. For the industry classification dimension, the industry classification dimension is the main node, and the tree structure division rule under this dimension is: all industries are divided into a preset number of industry classes, which are the first-level child nodes; each industry class is then divided step by step into sub-industries, which are the lower-level child nodes. For example, if the construction industry is the main node, then the real estate industry, the bridge construction industry, and so on are child nodes of the construction industry; the industrial real estate industry, the commercial real estate industry, and so on are child nodes of the real estate industry; and the last-level child nodes are the individual companies. Each next-level child node is a branch of its higher-level main node or child node.
2. For the holding relationship dimension, the holding relationship dimension is the main node, and the tree structure division rule under this dimension is: the next-level child nodes of the main node are the first-level companies, which are not held by any other company; the next-level child nodes of a first-level company are the second-level companies it holds; the next-level child nodes of a second-level company are the third-level companies it holds; and lower-level child nodes are extended in the same way. A company jointly held by several companies can serve as a common node that forms a parent-child link with each of the holding companies.
3. For the personnel structure dimension, the personnel structure dimension is the main node, and the tree structure division rule under this dimension is: the next-level child nodes of the main node are the first-level managers, who hold predetermined positions in each company (e.g., senior and middle management positions such as CEO, chief financial officer, sales director, executive director, or executive vice president); the next-level child nodes of a first-level manager are the first-level companies where that manager holds office; the next-level child nodes of a first-level company are the second-level managers holding predetermined positions within it, excluding the first-level managers at higher-level child nodes; the next-level child nodes of a second-level manager are the second-level companies where that manager holds office; the next-level child nodes of a second-level company are the third-level managers in all management positions within it, excluding the first-level and second-level managers at higher-level child nodes; the next-level child nodes of a third-level manager are the third-level companies where that manager holds office; and lower-level child nodes are extended in the same way for all positions.
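The tree construction for the industry classification dimension can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the `Node` class, the `build_industry_tree` and `depth` helpers, the record layout (`company`, `industry_path`), and the sample companies are all hypothetical.

```python
from typing import Dict, List, Optional


class Node:
    """One node of an information structure tree (illustrative, not from the patent)."""

    def __init__(self, name: str, parent: Optional["Node"] = None) -> None:
        self.name = name
        self.parent = parent
        self.children: Dict[str, "Node"] = {}

    def child(self, name: str) -> "Node":
        """Return the child with this name, creating it if needed."""
        if name not in self.children:
            self.children[name] = Node(name, parent=self)
        return self.children[name]


def build_industry_tree(records: List[dict]) -> Node:
    """Build the tree: main node, industry classes, sub-industries, companies.

    Each record is assumed to carry a company name and its industry path,
    ordered from the broadest industry class to the narrowest sub-industry.
    """
    root = Node("industry classification")  # main node of the dimension
    for rec in records:
        node = root
        for level in rec["industry_path"]:
            node = node.child(level)        # industry class, then sub-industries
        node.child(rec["company"])          # companies are the last-level child nodes
    return root


def depth(node: Node) -> int:
    """Number of edges from the node up to the main node."""
    steps = 0
    while node.parent is not None:
        node = node.parent
        steps += 1
    return steps


# Hypothetical sample data.
records = [
    {"company": "Company A",
     "industry_path": ["Construction", "Real estate", "Commercial real estate"]},
    {"company": "Company B",
     "industry_path": ["Construction", "Real estate", "Industrial real estate"]},
    {"company": "Company C",
     "industry_path": ["Construction", "Bridge construction"]},
]
tree = build_industry_tree(records)
```

The holding relationship and personnel structure trees could reuse the same `Node` class with a different rule for choosing each node's parent. A jointly held company would need more than one parent, which this single-parent sketch does not model.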
Step S20: calculating, from the similarity of the two companies under each information dimension and based on a predetermined similarity calculation rule, the overall similarity of the two companies across all information dimensions, and taking the overall similarity as the similarity between the two companies.
After the similarity of the two companies under each information dimension has been calculated from their child node positions in the pre-established information structure tree of each dimension, the overall similarity of the two companies across all information dimensions can be calculated based on a predetermined similarity calculation rule. For example, if the two companies belong to the same industry class, the similarity under the industry classification dimension can be given particular emphasis when determining the overall similarity between the two companies, that is, their final similarity.
In this embodiment, when a similarity calculation request for two companies is received, the overall similarity of the two companies across all information dimensions is calculated from their similarity under each information dimension, giving the similarity between the two companies. Because the calculation does not rely only on the characters and words of the two company names, but also draws on the similarity of the two companies under each distinct information dimension, the overall similarity is calculated in greater depth and more accurately, which improves the accuracy of calculating similarity between companies.
Further, in other embodiments, the above step of calculating the similarity of the two companies under each information dimension can include:
In the information structure tree of an information dimension, calculating the number of child nodes and/or main nodes contained in the node path from the child node position of one of the two companies to the child node position of the other, and taking the calculated number of nodes as the similarity of the two companies under that information dimension.
In this embodiment, to obtain the similarity of the two companies under each information dimension, the child node positions of the two companies in the information structure tree of the same information dimension are obtained first, and the degree of association of the two companies in that tree is then calculated from those positions. For example, in the information structure tree of a given information dimension, the number of child nodes and/or main nodes contained in the node path from the child node position of one company to the child node position of the other can be calculated. This node count reflects how closely the two companies are associated in the tree, and therefore how similar they are under that information dimension: the more nodes on the path, the lower the degree of association and the lower the similarity of the two companies under that dimension; the fewer nodes on the path, the higher the degree of association and the higher the similarity.
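The path node count described above can be sketched as follows. This is an illustrative sketch only: the parent-pointer representation, the `PARENT` sample tree, and the choice to count both endpoint nodes are assumptions, since the text does not say whether the two company nodes themselves are counted.

```python
from typing import Dict, List, Optional

# Parent pointers for one hypothetical industry tree; None marks the main node.
PARENT: Dict[str, Optional[str]] = {
    "industry": None,
    "Construction": "industry",
    "Real estate": "Construction",
    "Bridge construction": "Construction",
    "Company A": "Real estate",
    "Company B": "Real estate",
    "Company C": "Bridge construction",
}


def ancestors(node: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the chain from node up to the main node, node first."""
    chain = []
    current: Optional[str] = node
    while current is not None:
        chain.append(current)
        current = parent[current]
    return chain


def path_node_count(a: str, b: str, parent: Dict[str, Optional[str]]) -> int:
    """Count the nodes on the path from a to b, both endpoints included."""
    chain_a = ancestors(a, parent)
    steps_from_a = {name: i for i, name in enumerate(chain_a)}
    for steps_from_b, name in enumerate(ancestors(b, parent)):
        if name in steps_from_a:  # first shared node is the lowest common ancestor
            # nodes below the ancestor on a's side, the ancestor itself,
            # and nodes below the ancestor on b's side
            return steps_from_a[name] + 1 + steps_from_b
    raise ValueError("nodes are not in the same tree")
```

With this sample tree, Company A and Company B (same sub-industry) give a count of 3, while Company A and Company C (different sub-industries) give 5. A smaller count means a closer association, so a caller that wants "larger is more similar" would have to invert the count; that step is not specified in the text.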
Further, in other embodiments, the predetermined similarity calculation rule is:
Weighting the similarity of the two companies under each information dimension according to predetermined similarity weights, to obtain the overall similarity of the two companies across all information dimensions.
In this embodiment, after the similarity of the two companies under each information dimension has been calculated from their child node positions in the pre-established information structure tree of each dimension, those similarities can be weighted according to predetermined similarity weights to calculate the overall similarity of the two companies across all information dimensions. For example, a similarity weight can be set in advance for each information dimension. If the companies whose similarity is to be calculated include both companies in the same industry and companies in different industries, the weight of the industry classification dimension can be set to the largest weight. The overall similarity then identifies companies in the same industry more accurately and avoids mistaking companies in different industries for highly similar ones, which further improves the accuracy of the company similarity calculation.
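The weighting rule can be sketched as below. The function name, dimension keys, sample values, and weights are hypothetical, and the sketch assumes a plain weighted sum without normalization; the text only says the per-dimension similarities are weighted by predetermined weights.

```python
from typing import Dict


def weighted_overall_similarity(per_dimension: Dict[str, float],
                                weights: Dict[str, float]) -> float:
    """Sum weight * similarity over every information dimension."""
    missing = set(per_dimension) - set(weights)
    if missing:
        raise KeyError(f"no weight for dimensions: {sorted(missing)}")
    return sum(weights[dim] * value for dim, value in per_dimension.items())


# Hypothetical per-dimension similarities (here, path node counts) and
# weights that give the industry classification dimension the largest weight.
similarity = {"industry": 3, "holding": 5, "personnel": 7}
weights = {"industry": 0.5, "holding": 0.25, "personnel": 0.25}
overall = weighted_overall_similarity(similarity, weights)
```

Raising an error when a dimension has no weight is a design choice of this sketch; silently treating a missing weight as zero would hide configuration mistakes.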
In one embodiment, the overall similarity of the two companies across all information dimensions can be calculated from their similarity under each information dimension with the following similarity formula:
where S(V) denotes the overall similarity of the two companies across all information dimensions, H_i denotes the predetermined similarity weight of the i-th information dimension, and V_i denotes the similarity of the two companies under the i-th information dimension.
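The formula itself is not reproduced in this text. Given the definitions of S(V), H_i and V_i, a plain weighted sum is the most direct reading. The form below is an assumption (n, the number of information dimensions, is introduced here), and the original may, for example, also normalize by the sum of the weights.

```latex
S(V) = \sum_{i=1}^{n} H_i \, V_i
```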
Further, in other embodiments, the predetermined similarity calculation rule is:
Averaging the similarity of the two companies under each information dimension, to obtain the overall similarity of the two companies across all information dimensions.
In this embodiment, after the similarity of the two companies under each information dimension has been calculated from their child node positions in the pre-established information structure tree of each dimension, those similarities can be averaged to calculate the overall similarity of the two companies across all information dimensions. Using the average of the per-dimension similarities as the overall similarity balances the influence of each information dimension on the final result, makes the overall similarity more accurate and reasonable, and thus improves the accuracy of the company similarity calculation.
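The averaging rule is the equal-weight case and can be sketched with the standard library. The function name and the sample values are hypothetical.

```python
from statistics import mean
from typing import Dict


def average_overall_similarity(per_dimension: Dict[str, float]) -> float:
    """Arithmetic mean of the per-dimension similarities."""
    if not per_dimension:
        raise ValueError("at least one information dimension is required")
    return mean(per_dimension.values())


# Hypothetical per-dimension similarities for one pair of companies.
similarity = {"industry": 3, "holding": 5, "personnel": 7}
overall = average_overall_similarity(similarity)
```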
The present invention further provides a device for calculating company similarity.
Referring to Fig. 2, Fig. 2 is a functional block diagram of one embodiment of the device for calculating company similarity of the present invention.
In one embodiment, the device for calculating company similarity includes:
A first computing module 01, configured to, if a similarity calculation request for two companies is received, calculate the similarity of the two companies under each information dimension according to the child node positions of the two companies in the pre-established information structure tree of each information dimension; where the information structure tree is an information structure tree with a main node and child nodes, established for each predetermined information dimension from company information obtained from a predetermined database;
In this embodiment, a request sent by a user to calculate the similarity of different companies is received. For example, the request may be sent after the user enters relevant information (e.g., the names of the companies whose similarity is to be calculated) on a terminal such as a mobile phone, tablet computer, or self-service terminal device, either in a company similarity calculation application (APP) pre-installed on the terminal or in a browser on the terminal.
It should be noted that, for ease of description, this embodiment describes only the similarity calculation for two companies; the process for multiple companies can follow the two-company process and is not repeated here.
After the similarity calculation request for the two companies is received, the similarity of the two companies under each information dimension can be calculated from their child node positions in the pre-established information structure tree of each dimension. For example, the length of the node path between the two companies, the number of nodes the path passes through, and so on can be calculated from those positions and used as the similarity of the two companies under each information dimension. The information structure trees are established as follows:
Company information (e.g., company name, industry, main business scope, personnel structure information, shareholder information) is obtained from a predetermined database (e.g., the company information database of the Administration for Industry and Commerce or the company information database of a stock exchange). From the obtained company information, an information structure tree with a main node and child nodes is established for each predetermined information dimension (e.g., industry classification, holding relationship, personnel structure).
For example, the information structure tree with a main node and child nodes is established for the different information dimensions as follows:
1. For the industry classification dimension, the industry classification dimension is the main node, and the tree structure division rule under this dimension is: all industries are divided into a preset number of industry classes, which are the first-level child nodes; each industry class is then divided step by step into sub-industries, which are the lower-level child nodes. For example, if the construction industry is the main node, then the real estate industry, the bridge construction industry, and so on are child nodes of the construction industry; the industrial real estate industry, the commercial real estate industry, and so on are child nodes of the real estate industry; and the last-level child nodes are the individual companies. Each next-level child node is a branch of its higher-level main node or child node.
2. For the holding relationship dimension, the holding relationship dimension is the main node, and the tree structure division rule under this dimension is: the next-level child nodes of the main node are the first-level companies, which are not held by any other company; the next-level child nodes of a first-level company are the second-level companies it holds; the next-level child nodes of a second-level company are the third-level companies it holds; and lower-level child nodes are extended in the same way. A company jointly held by several companies can serve as a common node that forms a parent-child link with each of the holding companies.
3. For the personnel structure dimension, the personnel structure dimension is the main node, and the tree structure division rule under this dimension is: the next-level child nodes of the main node are the first-level managers, who hold predetermined positions in each company (e.g., senior and middle management positions such as CEO, chief financial officer, sales director, executive director, or executive vice president); the next-level child nodes of a first-level manager are the first-level companies where that manager holds office; the next-level child nodes of a first-level company are the second-level managers holding predetermined positions within it, excluding the first-level managers at higher-level child nodes; the next-level child nodes of a second-level manager are the second-level companies where that manager holds office; the next-level child nodes of a second-level company are the third-level managers in all management positions within it, excluding the first-level and second-level managers at higher-level child nodes; the next-level child nodes of a third-level manager are the third-level companies where that manager holds office; and lower-level child nodes are extended in the same way for all positions.
The second computing module 02 is configured to calculate, from the similarity of the two companies under each information dimension and based on a predetermined similarity calculation rule, the overall similarity of the two companies across all information dimensions, and to take this overall similarity as the similarity between the two companies.
After the similarity of the two companies under each information dimension has been calculated from their child node positions in the pre-established information structure tree of each information dimension, the overall similarity of the two companies across all information dimensions can be calculated based on the predetermined similarity calculation rule. For example, if the two companies belong to the same industry classification, their similarity under the industry classification dimension can be given particular emphasis when determining the overall similarity, which is the final similarity between the two companies.
In this embodiment, when a similarity calculation request for two companies is received, the overall similarity of the two companies across all information dimensions is calculated from their similarity under each information dimension, yielding the similarity between the two companies. Because the calculation does not rely solely on the characters and words of the two company names, but also on the similarity of the two companies under each distinct information dimension, the overall similarity can be computed in greater depth and more accurately, which improves the accuracy of inter-company similarity calculation.
Further, in other embodiments, the first computing module 01 may also be configured to:
In the information structure tree of one information dimension, calculate the number of child nodes and/or host nodes contained in the node path from the child node position of one of the two companies to the child node position of the other company, and take the calculated number of nodes as the similarity of the two companies under that information dimension.
In this embodiment, to obtain the similarity of the two companies under each information dimension, the child node positions of the two companies in the information structure tree of the same information dimension are first obtained, and the degree of association of the two companies within that tree is then calculated from those positions. For example, within the information structure tree of one information dimension, the number of child nodes and/or host nodes contained in the node path from the child node position of one company to the child node position of the other company can be calculated. This number of nodes reflects how closely the two companies are associated within the tree, and therefore also how similar they are under that information dimension. The more nodes on the path, the lower the degree of association of the two companies within the tree, and the lower their similarity under that information dimension; the fewer nodes on the path, the higher the degree of association, and the higher their similarity under that information dimension.
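The node count described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes each node has a single parent, it counts both endpoint nodes as part of the path (the source does not say whether endpoints are included), and the `parent` map with its industry and company names is hypothetical.

```python
def path_to_host(node, parent):
    """List the nodes from `node` up to the host node, which has no parent."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def nodes_on_path(a, b, parent):
    """Count the nodes on the tree path between a and b, endpoints included."""
    up_a = path_to_host(a, parent)
    depth_from_b = {n: i for i, n in enumerate(path_to_host(b, parent))}
    for steps_a, node in enumerate(up_a):
        if node in depth_from_b:  # first shared ancestor on the way up
            return steps_a + depth_from_b[node] + 1
    raise ValueError("nodes are not in the same tree")


# Hypothetical industry classification tree, stored as child -> parent.
parent = {
    "finance": "industry", "manufacturing": "industry",
    "banking": "finance", "insurance": "finance",
    "Bank X": "banking", "Bank Y": "banking",
    "Insurer Z": "insurance", "Maker W": "manufacturing",
}

same_sector = nodes_on_path("Bank X", "Bank Y", parent)      # 3 nodes
related_sector = nodes_on_path("Bank X", "Insurer Z", parent)  # 5 nodes
other_sector = nodes_on_path("Bank X", "Maker W", parent)    # 6 nodes
```

A smaller count means the two companies sit closer together in the tree, so under the rule above the two banks are the most similar pair and the bank and the manufacturer the least similar.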
Further, in other embodiments, the predetermined similarity calculation rule is:
The similarities of the two companies under the individual information dimensions are combined in a weighted calculation using predetermined similarity weights, to obtain the overall similarity of the two companies across all information dimensions.
In this embodiment, after the similarity of the two companies under each information dimension has been calculated from their child node positions in the pre-established information structure tree of each information dimension, those similarities can be combined in a weighted calculation using predetermined similarity weights, to obtain the overall similarity of the two companies across all information dimensions. For example, a similarity weight can be set in advance for each information dimension. If the companies whose similarity is to be calculated include both companies in the same industry and companies in different industries, the similarity weight of the industry classification dimension can be set to the largest weight. In this way, when the overall similarity of the companies is calculated, the similarity between companies in the same industry is identified more accurately, and companies in different industries are not mistakenly calculated as highly similar, which further improves the accuracy of company similarity calculation.
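The effect of giving the industry classification dimension the largest weight can be illustrated with a short sketch. Everything concrete here is an assumption made for illustration: the function name, the dimension keys, the weight values, and the convention that each per-dimension similarity is a score between 0 and 1 where higher means more similar.

```python
def overall_similarity(similarities, weights):
    """Weighted sum of per-dimension similarity scores."""
    missing = set(similarities) - set(weights)
    if missing:
        raise KeyError(f"no weight for dimensions: {sorted(missing)}")
    return sum(weights[dim] * value for dim, value in similarities.items())


# Hypothetical weights: industry classification carries the largest weight.
weights = {"industry": 0.6, "holding": 0.2, "personnel": 0.2}

# Hypothetical per-dimension scores for two pairs of companies.
pair_same_industry = {"industry": 0.9, "holding": 0.2, "personnel": 0.3}
pair_diff_industry = {"industry": 0.1, "holding": 0.8, "personnel": 0.7}

score_same = overall_similarity(pair_same_industry, weights)
score_diff = overall_similarity(pair_diff_industry, weights)
```

With these illustrative numbers, the pair in the same industry scores higher overall even though the other pair is closer on the holding and personnel dimensions, which is the behaviour the paragraph above describes.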
In one embodiment, the overall similarity of the two companies across all information dimensions can be calculated from their similarity under each information dimension using the following similarity formula:
where S(V) denotes the overall similarity of the two companies across all information dimensions, H_i denotes the predetermined similarity weight of the i-th information dimension, and V_i denotes the similarity of the two companies under the i-th information dimension.
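The formula itself does not survive in this text. One form consistent with the symbol definitions and with the weighted calculation described in this embodiment is a plain weighted sum. This is a reconstruction under that assumption: the upper limit n (the number of information dimensions) is introduced here, and the source does not state whether the weights are normalized.

```latex
S(V) = \sum_{i=1}^{n} H_i \, V_i
```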
Further, in other embodiments, the predetermined similarity calculation rule is:
The similarities of the two companies under the individual information dimensions are averaged to obtain the overall similarity of the two companies across all information dimensions.
In this embodiment, after the similarity of the two companies under each information dimension has been calculated from their child node positions in the pre-established information structure tree of each information dimension, those similarities can be averaged to obtain the overall similarity of the two companies across all information dimensions. Using the average of the per-dimension similarities as the overall similarity balances the influence of each information dimension on the final result, makes the overall similarity more accurate and reasonable, and thereby improves the accuracy of company similarity calculation.
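The averaging rule is simple enough to show directly. In this sketch the per-dimension values are node counts of the kind described earlier, so a smaller average indicates a closer overall relationship; the function name, dimension keys, and numbers are hypothetical.

```python
from statistics import mean


def average_similarity(similarities):
    """Average the per-dimension values, giving every dimension equal influence."""
    if not similarities:
        raise ValueError("at least one information dimension is required")
    return mean(similarities.values())


# Hypothetical node counts for one pair of companies in three dimensions.
node_counts = {"industry": 3, "holding": 5, "personnel": 4}
overall = average_similarity(node_counts)
```

Compared with the weighted rule, this choice needs no weights to be tuned in advance, at the cost of not being able to emphasize any one dimension.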
It should be noted that, in this document, the terms "comprise", "include", and any variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device comprising a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that comprises the element.
From the description of the above embodiments, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software plus the necessary general-purpose hardware platform, or of course by hardware, although in many cases the former is the preferred implementation. On this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, can be embodied as a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes instructions that cause a terminal device (which may be a mobile phone, computer, server, air conditioner, network device, or the like) to perform the methods described in the embodiments of the present invention.
The preferred embodiments of the present invention have been described above, and they do not limit the scope of rights of the present invention. The serial numbers of the above embodiments are for description only and do not indicate the relative merits of the embodiments. In addition, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order.
Those skilled in the art can implement the present invention in many variant forms without departing from its scope and essence; for example, a feature of one embodiment can be used in another embodiment to obtain yet another embodiment. Any modification, equivalent replacement, or improvement made within the technical concept of the present invention shall fall within the scope of rights of the present invention.

Claims (10)

1. A method for calculating company similarity, characterized in that the method comprises the following steps:
If a similarity calculation request for two companies is received, calculating the similarity of the two companies under each information dimension according to the child node positions of the two companies in the pre-established information structure tree of each information dimension; wherein the information structure trees are obtained by acquiring company information from a predetermined database and, according to the acquired company information, establishing for each predetermined information dimension an information structure tree having a host node and child nodes;
Calculating, according to the similarity of the two companies under each information dimension and based on a predetermined similarity calculation rule, the overall similarity of the two companies across all information dimensions, and taking the overall similarity as the similarity between the two companies.
2. The method for calculating company similarity as claimed in claim 1, characterized in that the step of calculating the similarity of the two companies under each information dimension comprises:
In the information structure tree of one information dimension, calculating the number of child nodes and/or host nodes contained in the node path from the child node position of one of the two companies to the child node position of the other company, and taking the calculated number of nodes as the similarity of the two companies under that information dimension.
3. The method for calculating company similarity as claimed in claim 1 or 2, characterized in that the predetermined similarity calculation rule is:
Combining the similarities of the two companies under the individual information dimensions in a weighted calculation using predetermined similarity weights, to obtain the overall similarity of the two companies across all information dimensions.
4. The method for calculating company similarity as claimed in claim 1 or 2, characterized in that the predetermined similarity calculation rule is:
Averaging the similarities of the two companies under the individual information dimensions to obtain the overall similarity of the two companies across all information dimensions.
5. The method for calculating company similarity as claimed in claim 1 or 2, characterized in that the predetermined database comprises a company information database of the administration for industry and commerce and/or a company information database of a stock exchange; and the information dimensions comprise industry classification, holding relationship, and/or personnel structure.
6. A device for calculating company similarity, characterized in that the device comprises:
A first computing module, configured to, if a similarity calculation request for two companies is received, calculate the similarity of the two companies under each information dimension according to the child node positions of the two companies in the pre-established information structure tree of each information dimension; wherein the information structure trees are obtained by acquiring company information from a predetermined database and, according to the acquired company information, establishing for each predetermined information dimension an information structure tree having a host node and child nodes;
A second computing module, configured to calculate, according to the similarity of the two companies under each information dimension and based on a predetermined similarity calculation rule, the overall similarity of the two companies across all information dimensions, and to take the overall similarity as the similarity between the two companies.
7. The device for calculating company similarity as claimed in claim 6, characterized in that the first computing module is further configured to:
In the information structure tree of one information dimension, calculate the number of child nodes and/or host nodes contained in the node path from the child node position of one of the two companies to the child node position of the other company, and take the calculated number of nodes as the similarity of the two companies under that information dimension.
8. The device for calculating company similarity as claimed in claim 6 or 7, characterized in that the predetermined similarity calculation rule is:
Combining the similarities of the two companies under the individual information dimensions in a weighted calculation using predetermined similarity weights, to obtain the overall similarity of the two companies across all information dimensions.
9. The device for calculating company similarity as claimed in claim 6 or 7, characterized in that the predetermined similarity calculation rule is:
Averaging the similarities of the two companies under the individual information dimensions to obtain the overall similarity of the two companies across all information dimensions.
10. The device for calculating company similarity as claimed in claim 6 or 7, characterized in that the predetermined database comprises a company information database of the administration for industry and commerce and/or a company information database of a stock exchange; and the information dimensions comprise industry classification, holding relationship, and/or personnel structure.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611265737.1A CN107908626A (en) 2016-12-30 2016-12-30 The computational methods and device of company's similarity

Publications (1)

Publication Number Publication Date
CN107908626A true CN107908626A (en) 2018-04-13

Family

ID=61839954

Country Status (1)

Country Link
CN (1) CN107908626A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005506618A (en) * 2001-10-18 2005-03-03 ビーイーエイ システムズ, インコーポレイテッド Application view components for system integration
CN105183767A (en) * 2015-07-31 2015-12-23 山东大学 Enterprise network-based enterprise business similarity calculation method and system
CN105138652A (en) * 2015-08-28 2015-12-09 山东合天智汇信息技术有限公司 Enterprise association recognition method and system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
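刘景方 et al.: "Research on an improved ontology concept semantic similarity algorithm", Journal of Wuhan University of Technology *
张毅辉 et al.: "Analysis of the criteria for identifying affiliated enterprises", Journal of the Taxation College of Yangzhou University *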

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111738864A (en) * 2020-08-14 2020-10-02 支付宝(杭州)信息技术有限公司 Method, device and equipment for identifying group to which business entity belongs
CN111738864B (en) * 2020-08-14 2020-12-18 支付宝(杭州)信息技术有限公司 Method, device and equipment for identifying group to which business entity belongs
CN112417879A (en) * 2020-11-25 2021-02-26 上海水滴征信服务有限公司 Determining business attribute similarity, rename object determination
CN112632954A (en) * 2020-12-29 2021-04-09 中译语通科技股份有限公司 Method and device for acquiring technical similarity of mechanisms

Legal Events

Date Code Title Description
PB01 Publication
TA01 Transfer of patent application right

Effective date of registration: 20180528

Address after: 518000 Room 201, building A, No. 1, Qian Wan Road, Qianhai Shenzhen Hong Kong cooperation zone, Shenzhen, Guangdong (Shenzhen Qianhai business secretary Co., Ltd.)

Applicant after: Shenzhen one ledger Intelligent Technology Co., Ltd.

Address before: 200030 Xuhui District, Shanghai Kai Bin Road 166, 9, 10 level.

Applicant before: Shanghai Financial Technologies Ltd

REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 1251054

Country of ref document: HK

SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20180413