CN113485987A - Enterprise information tag generation method and device - Google Patents

Info

Publication number
CN113485987A
Authority
CN
China
Prior art keywords
enterprise
data
information
relationship
database
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110744350.9A
Other languages
Chinese (zh)
Inventor
陈少冬
刘洋
江凌志
林一鸣
李睿军
李汉雄
周莉莉
张祎琛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
Original Assignee
China Construction Bank Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp
Priority to CN202110744350.9A
Publication of CN113485987A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology

Abstract

The invention discloses a method and a device for generating enterprise information labels. The method comprises: acquiring original data, wherein the original data comprises enterprise information data acquired from a database related information table, network public resources and an enterprise supplementary recording data channel; constructing enterprise relationship data according to the original data; importing the enterprise relationship data into a graph database to generate an enterprise relationship graph; and generating enterprise information labels for the enterprise according to the enterprise relationship graph. The invention relates to the field of big data technology. Because the enterprise information labels are generated from the enterprise relationship graph, the effective information in the enterprise information data can be understood quickly and intuitively from the labels, so that the enterprise information data is summarized and the efficiency of obtaining effective information from it is improved.

Description

Enterprise information tag generation method and device
Technical Field
The invention relates to the field of big data technology, in particular to a method and a device for generating enterprise information labels.
Background
This section is intended to provide a background or context to the embodiments of the invention that are recited in the claims. The description herein is not admitted to be prior art by inclusion in this section.
As a large economic organization, an enterprise holds a huge amount of enterprise information. In some cases, when an enterprise provides services for other enterprises, it needs to collect the enterprise information data of those enterprises, summarize and organize that data as needed to obtain effective information, and provide suitable services according to the effective information. Summarizing and sorting the enterprise information data consumes a large amount of manpower and material resources, and the effective information cannot be obtained intuitively and quickly.
Disclosure of Invention
The embodiment of the invention provides an enterprise information tag generation method, which is used for solving the problems that in the prior art, a large amount of manpower and material resources are consumed for carrying out induction and arrangement on enterprise information data of an enterprise, and effective information cannot be obtained intuitively and quickly, and comprises the following steps:
acquiring original data, wherein the original data comprises enterprise information data acquired from a database related information table, network public resources and an enterprise supplementary recording data channel;
constructing enterprise relationship data according to the original data;
importing the enterprise relationship data into a graph database to generate an enterprise relationship graph;
and generating enterprise information labels for the enterprise according to the enterprise relationship graph.
In one possible embodiment, the enterprise information data includes at least one of: enterprise shareholder data, enterprise practitioner data, business income data, total asset data, and enterprise listing data.
In one possible embodiment, obtaining raw data includes:
acquiring enterprise information data from listed-company financial report information, network public data and news by a crawler method;
acquiring enterprise information data by traversing the related information table of the database;
and enterprise information data fed back by the enterprise is obtained through the enterprise supplementary recording data channel.
In one possible implementation, the obtaining of the enterprise information data fed back by the enterprise through the enterprise supplementary recording data channel includes:
generating mandatory supplementary recording content prompt information and optional supplementary recording content prompt information according to the enterprise information data that is needed;
and generating an enterprise supplementary recording data channel according to the mandatory supplementary recording content prompt information and the optional supplementary recording content prompt information, and receiving, through the enterprise supplementary recording data channel, the enterprise information data fed back by the enterprise.
In one possible embodiment, constructing business relationship data from raw data comprises:
cleaning the original data, and removing repeated enterprise information data and invalid enterprise information data in the original data;
constructing enterprise relationship data according to the cleaned original data and a pre-trained enterprise relationship data construction model; the enterprise relationship data construction model is a machine learning model that mines and establishes the relationship structure among the various items of enterprise data.
In one possible embodiment, the method further comprises:
acquiring enterprise information data from the Internet by a crawler method;
and carrying out relation data labeling on the enterprise information data to obtain a training sample, and training the enterprise relation data construction model by using the training sample.
In one possible implementation, the cleaning of the original data to remove the duplicate enterprise information data and other data except the enterprise information data in the original data includes:
setting different weights for enterprise information data acquired by a database related information table, network public resources and an enterprise additional recording data channel respectively;
comparing the enterprise information data acquired from the database related information table, the network public resources and the enterprise supplementary recording data channel, removing repeated enterprise information data, and, where enterprise information data of the same type is inconsistent across channels, retaining the enterprise information data with the highest weight for that type, to obtain effective data;
and storing the effective data according to a preset data structure by using a word segmentation tool.
In one possible implementation, constructing the enterprise relational data according to the cleaned original data and the pre-trained enterprise relational data construction model, includes:
classifying the cleaned original data by utilizing an enterprise relational data construction model;
and according to the classification result, performing level classification on the data in each class, and generating a subordinate relationship among the data to obtain enterprise relationship data.
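As an illustration of the level classification and subordinate relationships just described, the following is a minimal Python sketch. It is not the patented model: the classification result is assumed to be already available as (category, level, name) items, and the rule that an item is subordinate to the most recent item one level above it in the same category is an assumption made only for this example. The function name and the relation name SUBORDINATE_TO are hypothetical.

```python
def build_subordination(items):
    """items: list of (category, level, name), ordered like an outline; level 1 is the top."""
    edges = []
    last_at_level = {}  # (category, level) -> most recently seen name at that level
    for category, level, name in items:
        # Assumed rule: the parent is the latest item one level higher in the same category.
        parent = last_at_level.get((category, level - 1))
        if parent is not None:
            edges.append((name, "SUBORDINATE_TO", parent))
        last_at_level[(category, level)] = name
    return edges


items = [
    ("shareholder", 1, "Acme"),
    ("shareholder", 2, "Holding Co"),
    ("shareholder", 2, "Fund B"),
    ("scale", 1, "Acme"),
    ("scale", 2, "small enterprise"),
]
edges = build_subordination(items)
```

The resulting edge list is one possible shape for the enterprise relationship data that is later imported into the graph database.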
In one possible implementation, importing enterprise relationship data into a graph database to generate an enterprise relationship graph, including:
importing the enterprise relational data into a Neo4j database, and generating a node corresponding to each data and a node name according to a category division result in the enterprise relational data and data contained in each category;
and aiming at each category, connecting each node according to the grade of each data in the category and the affiliation among the data to obtain the enterprise relationship graph.
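The node and edge creation described above could, for instance, be expressed as parameterized Cypher statements. The sketch below only builds (query, parameters) pairs in Python and does not connect to a database; the node label Data, the relationship type SUBORDINATE_TO and the property names are assumptions chosen for illustration, not names given in the source.

```python
# Cypher templates (label, relationship type and property names are hypothetical).
NODE_QUERY = "MERGE (n:Data {name: $name, category: $category}) SET n.level = $level"
EDGE_QUERY = (
    "MATCH (c:Data {name: $child, category: $category}), "
    "(p:Data {name: $parent, category: $category}) "
    "MERGE (c)-[:SUBORDINATE_TO]->(p)"
)


def to_cypher(nodes, edges):
    """nodes: (category, level, name) tuples; edges: (category, child, parent) tuples."""
    statements = []
    for category, level, name in nodes:
        # One node per data item, named after the item itself.
        statements.append((NODE_QUERY, {"name": name, "category": category, "level": level}))
    for category, child, parent in edges:
        # Connect nodes within a category according to the subordinate relationship.
        statements.append((EDGE_QUERY, {"child": child, "parent": parent, "category": category}))
    return statements


statements = to_cypher(
    nodes=[("scale", 1, "Acme"), ("scale", 2, "small enterprise")],
    edges=[("scale", "small enterprise", "Acme")],
)
```

Each pair could then be executed against Neo4j, for example through a driver session's run(query, parameters) call; MERGE is used so that re-running the import does not duplicate nodes or relationships.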
In one possible embodiment, generating enterprise information tags for an enterprise according to an enterprise relationship graph includes:
and generating the enterprise information label of the node according to the node name of the node with the lowest level under each category and each subordination relation.
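One way to read the step above is that, within each category, the names of the lowest-level nodes become the labels. The following Python sketch works on plain tuples rather than on a graph database; it assumes that a larger level number means a lower position in the hierarchy, and the function name is hypothetical.

```python
def lowest_level_tags(nodes):
    """nodes: list of (category, level, name); a larger level number is lower in the hierarchy."""
    deepest = {}
    for category, level, _ in nodes:
        # Track the deepest level seen in each category.
        deepest[category] = max(level, deepest.get(category, level))
    tags = {}
    for category, level, name in nodes:
        if level == deepest[category]:
            # The node name of a lowest-level node is used as the label text.
            tags.setdefault(category, []).append(name)
    return tags


nodes = [
    ("scale", 1, "Acme"),
    ("scale", 2, "small enterprise"),
    ("category", 1, "Acme"),
    ("category", 2, "growth enterprise"),
]
tags = lowest_level_tags(nodes)
```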
In one possible embodiment, constructing business relationship data from raw data comprises:
cleaning the original data and converting the original data into a preset format;
and importing the cleaned and converted original data into an Oracle database, and constructing enterprise relational data by adopting a relational database technology based on the Oracle database.
In one possible implementation, importing enterprise relationship data into a graph database to generate an enterprise relationship graph, including:
converting the enterprise relation data into a text data format by using an sqluldr2 tool, importing the converted enterprise relation data into a database Neo4j, and generating an enterprise relation map.
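Once the relational data has been exported to a text file, it has to be read back as relationship rows before it can be loaded into the graph. The Python sketch below parses such an export with the standard csv module; the comma delimiter and the column names child, relation and parent are assumptions, since the source does not specify the export layout.

```python
import csv
import io


def read_exported_relations(text, delimiter=","):
    """Parse a delimited text export with a header row into (child, relation, parent) triples."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [(row["child"], row["relation"], row["parent"]) for row in reader]


sample = "child,relation,parent\nHolding Co,SHAREHOLDER_OF,Acme\n"
triples = read_exported_relations(sample)
```

As an alternative to parsing in application code, Neo4j's Cypher language provides a LOAD CSV clause that reads delimited files directly.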
In one possible embodiment, constructing business relationship data from raw data comprises:
cleaning and classifying the original data according to a comet-benefit policy to obtain effective classified data;
constructing enterprise relation data according to the effective classification data; the enterprise relation data comprises enterprise scale relation data and enterprise shareholder relation data;
importing the enterprise relationship data into a graph database to generate an enterprise relationship graph, wherein the method comprises the following steps:
importing the enterprise relationship data into a graph database, and generating an enterprise relationship graph according to the enterprise scale relationship data and the enterprise shareholder relationship data; the enterprise relationship map comprises an enterprise shareholder relationship map and an enterprise scale relationship map;
generating enterprise information labels for the enterprises according to the enterprise relationship maps, wherein the method comprises the following steps:
generating enterprise category labels of the enterprises according to the enterprise shareholder relationship graph; wherein the enterprise category labels include: headquarters enterprises and growth enterprises;
generating enterprise scale labels of the enterprises according to the enterprise scale relationship graph; wherein the enterprise scale labels include: large enterprises, medium enterprises, small enterprises and micro enterprises.
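The text names the four scale labels but does not state in this passage the rules that assign them. Purely as an illustration, the Python sketch below maps a business income figure to one of the four labels using three ascending cut-offs supplied by the caller; the use of business income as the deciding quantity, the function name and the example cut-off values are all assumptions.

```python
SCALE_LABELS = ["micro enterprise", "small enterprise", "medium enterprise", "large enterprise"]


def scale_label(business_income, thresholds):
    """thresholds: three ascending income cut-offs (placeholder values chosen by the caller)."""
    # Count how many cut-offs the income reaches; that count indexes the label list.
    rank = sum(business_income >= t for t in thresholds)
    return SCALE_LABELS[rank]


example_thresholds = (100, 1000, 10000)  # hypothetical cut-offs, not taken from the source
label = scale_label(5000, example_thresholds)
```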
The embodiment of the invention also provides an enterprise information tag generation device, which is used for solving the problems that in the prior art, a large amount of manpower and material resources are consumed for summarizing and sorting enterprise information data of an enterprise, and effective information cannot be obtained intuitively and quickly, and the device comprises:
an acquisition module, used for acquiring original data, wherein the original data comprises enterprise information data acquired from a database related information table, network public resources and an enterprise supplementary recording data channel;
the construction module is used for constructing enterprise relation data according to the original data;
the first processing module is used for importing the enterprise relationship data into a graph database to generate an enterprise relationship graph;
and the second processing module is used for generating enterprise information labels for the enterprise according to the enterprise relationship graph.
In one possible embodiment, the enterprise information data includes at least one of: enterprise shareholder data, enterprise practitioner data, business income data, total asset data, and enterprise listing data.
In one possible implementation manner, the acquisition module is specifically configured to acquire enterprise information data from listed-company financial report information, network public data and news through a crawler method; acquire enterprise information data by traversing the database related information table; and obtain enterprise information data fed back by the enterprise through the enterprise supplementary recording data channel.
In a possible implementation manner, the acquisition module is specifically configured to generate mandatory supplementary recording content prompt information and optional supplementary recording content prompt information according to the enterprise information data that is needed; generate an enterprise supplementary recording data channel according to the mandatory supplementary recording content prompt information and the optional supplementary recording content prompt information; and receive, through the enterprise supplementary recording data channel, the enterprise information data fed back by the enterprise.
In a possible implementation manner, the construction module is specifically configured to clean original data, and remove repeated enterprise information data and invalid enterprise information data in the original data;
constructing enterprise relationship data according to the cleaned original data and a pre-trained enterprise relationship data construction model; the enterprise relationship data construction model is a machine learning model that mines and establishes the relationship structure among the various items of enterprise data.
In one possible embodiment, the device further comprises:
the third processing module is used for acquiring enterprise information data from the Internet by a crawler method;
and carrying out relation data labeling on the enterprise information data to obtain a training sample, and training the enterprise relation data construction model by using the training sample.
In a possible implementation manner, the construction module is specifically configured to set different weights for enterprise information data acquired by a database related information table, a network public resource, and an enterprise supplementary recording data channel, respectively;
comparing the enterprise information data acquired from the database related information table, the network public resources and the enterprise supplementary recording data channel, removing repeated enterprise information data, and, where enterprise information data of the same type is inconsistent across channels, retaining the enterprise information data with the highest weight for that type, to obtain effective data;
and storing the effective data according to a preset data structure by using a word segmentation tool.
In a possible implementation manner, the construction module is specifically configured to classify the cleaned original data by using an enterprise relational data construction model;
and according to the classification result, performing level classification on the data in each class, and generating a subordinate relationship among the data to obtain enterprise relationship data.
In a possible implementation manner, the first processing module is specifically configured to import the enterprise relationship data into a Neo4j database, and generate a node and a node name corresponding to each data according to a classification result in the enterprise relationship data and data included in each class;
and aiming at each category, connecting each node according to the grade of each data in the category and the affiliation among the data to obtain the enterprise relationship graph.
In a possible implementation manner, the second processing module is specifically configured to generate an enterprise information tag of a node according to a node name of a node with a lowest level in each category and each dependency relationship.
In a possible implementation manner, the construction module is specifically configured to clean the original data and convert the original data into a preset format;
and importing the cleaned and converted original data into an Oracle database, and constructing enterprise relational data by adopting a relational database technology based on the Oracle database.
In a possible implementation manner, the first processing module is specifically configured to convert the enterprise relationship data into a text data format by using an sqluldr2 tool, import the converted enterprise relationship data into a Neo4j database, and generate an enterprise relationship graph.
In a possible implementation manner, the construction module is specifically configured to perform cleaning and classification on original data according to a coma compensation policy to obtain effective classification data;
constructing enterprise relation data according to the effective classification data; the enterprise relation data comprises enterprise scale relation data and enterprise shareholder relation data;
the first processing module is specifically used for importing the enterprise relationship data into a graph database and generating an enterprise relationship graph according to the enterprise scale relationship data and the enterprise shareholder relationship data; the enterprise relationship map comprises an enterprise shareholder relationship map and an enterprise scale relationship map;
the second processing module is specifically used for generating enterprise category labels of the enterprises according to the enterprise shareholder relationship graph; wherein the enterprise category labels include: headquarters enterprises and growth enterprises;
generating enterprise scale labels of the enterprises according to the enterprise scale relationship graph; wherein the enterprise scale labels include: large enterprises, medium enterprises, small enterprises and micro enterprises.
The embodiment of the invention also provides computer equipment which comprises a memory, a processor and a computer program which is stored on the memory and can run on the processor, wherein the processor realizes the enterprise information label generating method when executing the computer program.
An embodiment of the present invention further provides a computer-readable storage medium, where a computer program for executing the above-mentioned enterprise information tag generation method is stored in the computer-readable storage medium.
In the embodiment of the invention, original data is acquired, wherein the original data comprises enterprise information data acquired from a database related information table, network public resources and an enterprise supplementary recording data channel; enterprise relationship data is constructed according to the original data; the enterprise relationship data is imported into a graph database to generate an enterprise relationship graph; and enterprise information labels are generated for the enterprise according to the enterprise relationship graph. The enterprise information labels can therefore be generated from the enterprise relationship graph, and the effective information in the enterprise information data can be obtained quickly and intuitively from the labels, so that the enterprise information data is summarized and the efficiency of obtaining effective information from it is improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts. In the drawings:
FIG. 1 is a flowchart of an enterprise information tag generation method provided in an embodiment of the present invention;
FIG. 2 is a schematic diagram of enterprise relationship data provided in an embodiment of the present invention;
FIG. 3 is a schematic diagram of an enterprise relationship graph provided in an embodiment of the present invention;
FIG. 4 is a diagram illustrating an example of a method for generating an enterprise information tag according to an embodiment of the present invention;
FIG. 5 is a diagram illustrating an enterprise information tag based on a complementary policy and corresponding tag definition rules according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of an enterprise information tag generation apparatus provided in an embodiment of the present invention;
FIG. 7 is a schematic diagram of a computer device provided in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the embodiments of the present invention are further described in detail below with reference to the accompanying drawings. The exemplary embodiments and descriptions of the present invention are provided to explain the present invention, but not to limit the present invention.
The term "and/or" herein merely describes an associative relationship, meaning that three relationships may exist, e.g., a and/or B, may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the term "at least one" herein means any one of a plurality or any combination of at least two of a plurality, for example, including at least one of A, B, C, and may mean including any one or more elements selected from the group consisting of A, B and C.
In the description of the present specification, the terms "comprising," "including," "having," "containing," and the like are used in an open-ended fashion, i.e., to mean including, but not limited to. Reference to the description of the terms "one embodiment," "a particular embodiment," "some embodiments," "for example," etc., means that a particular feature, structure, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. The sequence of steps involved in the embodiments is for illustrative purposes to illustrate the implementation of the present application, and the sequence of steps is not limited and can be adjusted as needed.
Research shows that enterprise information data of an enterprise covers information of the enterprise, the information amount is huge, but under certain conditions, only part of information in the enterprise information data is needed, so that more manpower and material resources are needed to be consumed to summarize and sort a large amount of enterprise information data, actually needed effective information is screened out, and the effective information cannot be obtained quickly and intuitively.
In view of the above research, an embodiment of the present invention provides an enterprise information tag generating method, as shown in fig. 1, including:
S101: acquiring original data, wherein the original data comprises enterprise information data acquired from a database related information table, network public resources and an enterprise supplementary recording data channel;
S102: constructing enterprise relationship data according to the original data;
S103: importing the enterprise relationship data into a graph database to generate an enterprise relationship graph;
S104: generating enterprise information labels for the enterprise according to the enterprise relationship graph.
The embodiment of the invention acquires original data, wherein the original data comprises enterprise information data acquired from a database related information table, network public resources and an enterprise supplementary recording data channel; constructs enterprise relationship data according to the original data; imports the enterprise relationship data into a graph database to generate an enterprise relationship graph; and generates enterprise information labels for the enterprise according to the enterprise relationship graph. The enterprise information labels can therefore be generated from the enterprise relationship graph, and the effective information in the enterprise information data can be obtained quickly and intuitively from the labels, so that the enterprise information data is summarized and the efficiency of obtaining effective information from it is improved.
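The four steps S101 to S104 can be read as a simple pipeline. The following is a minimal Python sketch under stated assumptions, not the patented implementation: all function names, the record fields and the toy values are hypothetical, and an in-memory dictionary stands in for the graph database.

```python
from collections import defaultdict


def acquire_raw_data():
    # S101: records gathered from the channels (toy values, hypothetical schema).
    return [
        {"enterprise": "Acme", "type": "shareholder", "value": "Holding Co", "channel": "database_table"},
        {"enterprise": "Acme", "type": "scale", "value": "small", "channel": "supplementary_entry"},
    ]


def build_relationship_data(raw):
    # S102: turn each record into a (subject, relation, object) triple.
    return [(r["enterprise"], r["type"], r["value"]) for r in raw]


def import_into_graph(triples):
    # S103: an adjacency mapping standing in for a graph database.
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return dict(graph)


def generate_tags(graph):
    # S104: derive one label per edge of each enterprise node.
    return {ent: [f"{rel}:{obj}" for rel, obj in edges] for ent, edges in graph.items()}


def run_pipeline():
    return generate_tags(import_into_graph(build_relationship_data(acquire_raw_data())))
```

Keeping each step as a separate function mirrors the S101 to S104 split, so any single stage (for example the graph import) could be swapped for a real backend without touching the others.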
The following describes the details of S101 to S104.
For the above S101, the original data includes, for example, enterprise information data obtained from a database-related information table, a network public resource, and an enterprise supplementary recording data channel.
The database-related information table is, for example, an official channel for registering enterprise information data for an enterprise (for example, national enterprise credit data, a building city information center, etc.); the network public resources comprise, for example, public financial information, network public data, news and the like, and the enterprise supplementary recording data channel comprises, for example, links, web pages, web addresses and the like for actively inputting enterprise information data for the enterprise.
Additionally, the enterprise information data includes at least one of: enterprise shareholder data, enterprise practitioner data, business income data, total asset data, and enterprise listing data.
Here, the enterprise shareholder data includes, for example, the shareholders of the enterprise and the shareholding proportion of each shareholder; the enterprise practitioner data includes, for example, the number of enterprise practitioners, the income of enterprise practitioners, and the distribution of practitioners within the enterprise; the business income data includes, for example, the total income of the enterprise and the income corresponding to each line of business of the enterprise; the total asset data includes, for example, current assets, long-term investments, fixed assets, intangible and deferred assets, and other long-term assets; the enterprise listing data includes, for example, the listing type of the enterprise (e.g., off-market, repurposed, New Third Board, subject board, main board, medium-sized, etc.).
Specifically, the original data may be acquired, for example, by at least one of the following methods ① to ③:
①: acquiring enterprise information data from listed-company financial report information, network public data and news by a crawler method.
②: acquiring enterprise information data by traversing the database related information table.
The business registration information of the enterprise is obtained from an official channel, and the enterprise information data is obtained according to the business registration information of the enterprise.
③: and enterprise information data fed back by the enterprise is obtained through the enterprise supplementary recording data channel.
Illustratively, mandatory supplementary recording content prompt information and optional supplementary recording content prompt information are generated according to the enterprise information data that is needed; an enterprise supplementary recording data channel is generated according to the mandatory supplementary recording content prompt information and the optional supplementary recording content prompt information, and the enterprise information data fed back by the enterprise is received through the enterprise supplementary recording data channel; the enterprise supplementary recording data channel includes, for example, a link, a web page, a web address, and the like.
For the above S102, the enterprise relationship data may be constructed from the original data, for example, by either of the following methods A and B:
A: cleaning the original data, and removing repeated enterprise information data and invalid enterprise information data in the original data; and constructing enterprise relationship data according to the cleaned original data and a pre-trained enterprise relationship data construction model.
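The supplementary recording channel described above amounts to a form with mandatory and optional fields. The following Python sketch shows one possible shape for it; the function names, field names and prompt wording are hypothetical, and the sketch leaves out how the link or web page itself is served.

```python
def build_entry_form(required_fields, optional_fields):
    # Prompt text per field; mandatory fields are marked so the enterprise knows what must be filled in.
    prompts = {f: f"Please enter {f} (required)" for f in required_fields}
    prompts.update({f: f"Please enter {f} (optional)" for f in optional_fields})
    return {"required": list(required_fields), "optional": list(optional_fields), "prompts": prompts}


def receive_submission(form, submission):
    # Reject feedback that leaves a mandatory field empty or absent.
    missing = [f for f in form["required"] if not str(submission.get(f, "")).strip()]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    # Keep only the fields the form actually asked for.
    allowed = set(form["required"]) | set(form["optional"])
    return {k: v for k, v in submission.items() if k in allowed}


form = build_entry_form(["business_income"], ["practitioner_count"])
accepted = receive_submission(form, {"business_income": "1200", "extra": "x"})
```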
Here, since the original data includes the enterprise information data acquired from each channel (e.g., database-related information table, network public resource, enterprise supplementary recording data channel), there may be duplicate enterprise information data in the original data. The credibility of the enterprise information data of different channels is also different, and the enterprise information data of the same type but different channels may be different, so that the enterprise information data of the channel with higher credibility needs to be reserved, and therefore the original data needs to be cleaned.
Specifically, the raw data may be cleaned, for example, by the following method: setting different weights for enterprise information data acquired by a database related information table, network public resources and an enterprise additional recording data channel respectively; comparing enterprise information data acquired from a database related information table, network public resources and an enterprise additional entry data channel, removing repeated enterprise information data, and retaining the enterprise information data of the same type but inconsistent with the enterprise information data with the highest weight in the type to obtain effective data; storing the effective data according to a preset data structure by using a word segmentation tool; the preset data structure, for example, predefines the division of the enterprise information data into different data categories.
For example, because the enterprise information data obtained through the enterprise supplementary recording data channel is entered by the enterprise itself, its credibility is higher than that of data obtained from network public resources. The enterprise information data acquired from the database related information table is data registered by the enterprise, so its credibility is also higher than that of data acquired from network public resources; however, the registration time may be long before the time at which the original data is acquired, and the registered data may not be updated promptly when the enterprise's information changes, so its credibility is lower than that of data from the enterprise supplementary recording data channel. On this basis, the weight set for data from the enterprise supplementary recording data channel is greater than the weight set for data in the database related information table, which in turn is greater than the weight set for data acquired from network public resources. When the enterprise information data acquired from the three channels is identical, the data from any one channel is retained; when the data is inconsistent across the channels, the data from the channel with the highest weight, namely the enterprise supplementary recording data channel, is retained. For example, when the business income data acquired from the enterprise supplementary recording data channel differs from that acquired from the database related information table or the network public resources, the business income data acquired from the enterprise supplementary recording data channel is retained.
Therefore, redundant and inaccurate enterprise information data in the original data can be removed by cleaning the original data, and convenience is brought to the generation of subsequent enterprise information tags.
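The weighted cleaning described above can be sketched as follows. This is a minimal illustration only: the numeric weights, the channel keys, and the record layout are assumptions made for the example; only the ordering of the weights (supplementary recording channel, then database table, then public network resources) comes from the description.

```python
from typing import Dict, List, Tuple

# Hypothetical channel weights: only their ordering is taken from the text.
CHANNEL_WEIGHTS: Dict[str, int] = {
    "supplementary_channel": 3,
    "database_table": 2,
    "public_network": 1,
}

# One record: (enterprise id, data type, value, source channel).
Record = Tuple[str, str, str, str]


def clean_records(records: List[Record]) -> Dict[Tuple[str, str], str]:
    """Keep one value per (enterprise, data type), preferring the heaviest channel."""
    best: Dict[Tuple[str, str], Tuple[int, str]] = {}
    for enterprise, data_type, value, channel in records:
        key = (enterprise, data_type)
        weight = CHANNEL_WEIGHTS[channel]
        # Duplicates collapse onto the same key; conflicts are won by weight.
        if key not in best or weight > best[key][0]:
            best[key] = (weight, value)
    return {key: value for key, (_, value) in best.items()}


raw = [
    ("A", "revenue", "120", "public_network"),
    ("A", "revenue", "150", "supplementary_channel"),
    ("A", "revenue", "130", "database_table"),
    ("A", "employees", "80", "database_table"),
    ("A", "employees", "80", "public_network"),
]
valid = clean_records(raw)
print(valid)
```

In this sketch the conflicting revenue values resolve to the value from the supplementary recording channel, while the duplicated employee count is stored once.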
In addition, the enterprise relationship data construction model is a machine-learning model for mining and establishing the relationship structure among various kinds of enterprise data. In another embodiment of the invention, enterprise information data is acquired from the Internet by a crawler method; relationship data labeling is performed on the enterprise information data to obtain training samples; and the enterprise relationship data construction model is trained with the training samples.
Illustratively, enterprise information data of a plurality of sample enterprises is acquired from the Internet as sample data by a crawler method; for each sample enterprise, the corresponding sample data is processed by the enterprise relationship data construction model to obtain enterprise relationship data for that sample enterprise; the loss of the model is determined according to the enterprise relationship data obtained for each sample enterprise, and the model is trained according to the loss; and the trained enterprise relationship data construction model is obtained after multiple rounds of training.
Specifically, the enterprise relationship data may be constructed from the cleaned original data and the pre-trained enterprise relationship data construction model, for example, as follows: classifying the cleaned original data by using the enterprise relationship data construction model; and, according to the classification result, dividing the data in each class into levels and generating the subordinate relationships among the data, to obtain the enterprise relationship data.
Illustratively, the enterprise information data in the original data is divided into an enterprise stock control class, an enterprise business class, and an enterprise investment class. The enterprise stock control class comprises enterprise shareholder data, shareholder shareholding ratio data, and the like; the enterprise business class comprises enterprise business type data, association relationship data among the business types, and the like; the enterprise investment class comprises project type data of enterprise investments, invested fund data for each project type, and the like. Taking the enterprise business class as an example, fig. 2 is a schematic diagram of enterprise relationship data provided by the embodiment of the present invention, in which the enterprise business class includes two fields: a data type field and a data level field. The enterprise has 12 business types, business type 1 to business type 12, divided into three levels. The first level comprises business type 1 and business type 2; first-level business type 1 contains second-level business types 3 and 4; first-level business type 2 contains second-level business types 5, 6, and 7; second-level business type 3 contains third-level business types 8 and 9; and second-level business type 6 contains third-level business type 10.
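The levels and subordinate relationships of the fig. 2 example can be sketched as a small parent-to-children structure. The row layout and the function name below are assumptions for illustration; business types 5 to 7 are read as second-level children of business type 2, and only the business types whose position is stated in the description are included.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# (data type, data level, parent) rows following the fig. 2 description.
Row = Tuple[str, int, Optional[str]]

ROWS: List[Row] = [
    ("business type 1", 1, None),
    ("business type 2", 1, None),
    ("business type 3", 2, "business type 1"),
    ("business type 4", 2, "business type 1"),
    ("business type 5", 2, "business type 2"),
    ("business type 6", 2, "business type 2"),
    ("business type 7", 2, "business type 2"),
    ("business type 8", 3, "business type 3"),
    ("business type 9", 3, "business type 3"),
    ("business type 10", 3, "business type 6"),
]


def build_children(rows: List[Row]) -> Dict[str, List[str]]:
    """Map each parent to its children, checking that levels differ by exactly one."""
    level = {name: lvl for name, lvl, _ in rows}
    children: Dict[str, List[str]] = defaultdict(list)
    for name, lvl, parent in rows:
        if parent is None:
            continue
        if level[parent] + 1 != lvl:
            raise ValueError(f"{name} is not one level below {parent}")
        children[parent].append(name)
    return dict(children)


children = build_children(ROWS)
print(children["business type 2"])
```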
B: cleaning the original data and converting it into a preset format; importing the cleaned and converted original data into an Oracle database; and constructing the enterprise relationship data by using relational database technology based on the Oracle database.
Here, the original data is cleaned in a manner similar to that of method A, which is not repeated here. When the enterprise relationship data is constructed by relational database technology based on the Oracle database, the cleaned and converted original data is, for example, classified; a two-dimensional data table is generated for each class; and the data is stored at the corresponding positions of the two-dimensional data table according to the subordinate relationships among the data in each class, to obtain the enterprise relationship data.
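A two-dimensional table of this kind might look like the sketch below. Note the substitution: Python's built-in SQLite module stands in for the Oracle database so that the example is self-contained, and the table and column names are hypothetical.

```python
import sqlite3

# SQLite (in memory) is used here only as a stand-in for the Oracle database.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE enterprise_business (
        data_type   TEXT PRIMARY KEY,
        data_level  INTEGER NOT NULL,
        parent_type TEXT REFERENCES enterprise_business(data_type)
    )
    """
)

# Each row stores a business type, its level, and the type it is subordinate to.
rows = [
    ("business type 1", 1, None),
    ("business type 3", 2, "business type 1"),
    ("business type 4", 2, "business type 1"),
    ("business type 8", 3, "business type 3"),
]
conn.executemany("INSERT INTO enterprise_business VALUES (?, ?, ?)", rows)

# Subordinate relationships are recovered with an ordinary query.
subordinates = [
    r[0]
    for r in conn.execute(
        "SELECT data_type FROM enterprise_business "
        "WHERE parent_type = ? ORDER BY data_type",
        ("business type 1",),
    )
]
print(subordinates)
```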
For S103 above, the enterprise relationship data may be imported into the graph database to generate the enterprise relationship graph, for example, as follows: importing the enterprise relationship data into a Neo4j database, and generating a node and a node name for each piece of data according to the classification result in the enterprise relationship data and the data included in each class; and, for each class, connecting the nodes according to the level of each piece of data in the class and the subordinate relationships among the data, to obtain the enterprise relationship graph.
Exemplarily, fig. 3 is a schematic diagram of an enterprise relationship graph provided by an embodiment of the present invention. The highest-level node corresponds to the enterprise identifier of the enterprise, with the node name enterprise A; the next-level nodes are the enterprise stock control class, the enterprise business class, and the enterprise investment class. Fig. 3 shows the enterprise business class in detail as an example: according to the enterprise relationship data shown in fig. 2, a node and a node name are generated for each business type, and the nodes are connected according to the levels and subordinate relationships of the business types, to obtain the nodes under the enterprise business class node. The enterprise business class node is connected to the business type 1 node and the business type 2 node; the business type 1 node is further connected to the business type 3 node and the business type 4 node; the business type 2 node is further connected to the business type 5 node, the business type 6 node, and the business type 7 node; the business type 3 node is further connected to the business type 8 node and the business type 9 node; and the business type 6 node is further connected to the business type 10 node.
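One way to express the node and connection generation for Neo4j is to build parameterized Cypher statements from the parent and child pairs, as in the sketch below. The node label `Node`, the relationship type `CONTAINS`, and the function name are assumptions for illustration; the statements are only constructed here, not sent to a database.

```python
from typing import Dict, List, Tuple

# A statement is a Cypher query plus its parameters.
Statement = Tuple[str, Dict[str, str]]

# Part of the fig. 3 example as (parent node name, child node name) pairs.
EDGES: List[Tuple[str, str]] = [
    ("enterprise A", "enterprise business class"),
    ("enterprise business class", "business type 1"),
    ("enterprise business class", "business type 2"),
    ("business type 1", "business type 3"),
    ("business type 3", "business type 8"),
]

# MERGE creates a node or relationship only if it does not exist yet,
# so shared parents are not duplicated across statements.
MERGE_EDGE = (
    "MERGE (p:Node {name: $parent}) "
    "MERGE (c:Node {name: $child}) "
    "MERGE (p)-[:CONTAINS]->(c)"
)


def to_cypher(edges: List[Tuple[str, str]]) -> List[Statement]:
    """Turn each subordinate relationship into one parameterized statement."""
    return [(MERGE_EDGE, {"parent": p, "child": c}) for p, c in edges]


statements = to_cypher(EDGES)
for query, params in statements:
    print(query, params)
```

Each statement could then be passed, together with its parameters, to whatever Neo4j client the deployment uses.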
In addition, when the enterprise relationship data has been constructed by relational database technology based on the Oracle database, importing the enterprise relationship data into the graph database to generate the enterprise relationship graph further comprises: converting the enterprise relationship data into a text data format by using the sqluldr2 tool, importing the converted enterprise relationship data into a Neo4j database, and generating the enterprise relationship graph.
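The text-format hand-off can be sketched as follows, assuming the exported file is comma-separated with a header row; that layout, the column names, the file name `relations.csv`, and the node label and relationship type in the Cypher string are all hypothetical. The `LOAD CSV` statement is shown as a string only and is not executed.

```python
import csv
import io
from typing import List, Tuple

# Assumed content of the exported text file (column names are hypothetical).
EXPORTED_TEXT = """parent,child,level
enterprise business class,business type 1,1
business type 1,business type 3,2
business type 3,business type 8,3
"""

# Cypher that Neo4j could run against such a file in its import directory.
LOAD_CSV_QUERY = (
    "LOAD CSV WITH HEADERS FROM 'file:///relations.csv' AS row "
    "MERGE (p:Node {name: row.parent}) "
    "MERGE (c:Node {name: row.child}) "
    "MERGE (p)-[:CONTAINS]->(c)"
)


def read_edges(text: str) -> List[Tuple[str, str, int]]:
    """Parse the exported text into (parent, child, level) tuples."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["parent"], row["child"], int(row["level"])) for row in reader]


edges = read_edges(EXPORTED_TEXT)
print(edges)
```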
For S104 above, when the enterprise information tags are generated for the enterprise according to the enterprise relationship graph, the enterprise information tag of a node may be generated, for example, according to the node name of the lowest-level node in each class and in each subordinate relationship.
Specifically, when the enterprise information tag is generated from the node name of the lowest-level node, the node name may, for example, be used directly as the tag name; alternatively, according to predefined tag definition rules, it is determined which enterprise information tag definition the lowest-level node name satisfies, and the corresponding enterprise information tag is generated.
To describe the enterprise information tag generation method of the embodiment of the present invention more clearly, fig. 4 shows a specific example of the method, which includes:
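A minimal sketch of this step, assuming the graph is available as a parent-to-children mapping: the lowest-level node of each branch is found by a depth-first walk, and its name is either used directly as the tag or mapped through a rule table. The rule table and its single entry are hypothetical.

```python
from typing import Dict, List

# Subordinate relationships under the enterprise business class (fig. 3 example).
CHILDREN: Dict[str, List[str]] = {
    "enterprise business class": ["business type 1", "business type 2"],
    "business type 1": ["business type 3", "business type 4"],
    "business type 2": ["business type 5", "business type 6", "business type 7"],
    "business type 3": ["business type 8", "business type 9"],
    "business type 6": ["business type 10"],
}

# Hypothetical tag definition rules: node name -> tag name.
TAG_RULES: Dict[str, str] = {"business type 10": "example tag for business type 10"}


def lowest_level_nodes(children: Dict[str, List[str]], root: str) -> List[str]:
    """Return the nodes that have no subordinate nodes, in depth-first order."""
    stack, leaves = [root], []
    while stack:
        node = stack.pop()
        kids = children.get(node, [])
        if kids:
            stack.extend(reversed(kids))  # keep left-to-right visiting order
        else:
            leaves.append(node)
    return leaves


def generate_tags(children: Dict[str, List[str]], root: str,
                  rules: Dict[str, str]) -> List[str]:
    """Use the rule's tag when one matches, otherwise the node name itself."""
    return [rules.get(name, name) for name in lowest_level_nodes(children, root)]


tags = generate_tags(CHILDREN, "enterprise business class", TAG_RULES)
print(tags)
```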
S401: acquiring original data from network public resources, database related information tables, and the enterprise supplementary recording data channel;
S402: cleaning the original data: (1) removing duplicate and erroneous enterprise information data in the original data to obtain valid data; (2) storing the valid data according to a preset data structure by using a word segmentation tool;
S403: constructing enterprise relationship data from the cleaned original data by using either a pre-trained enterprise relationship data construction model or relational database technology based on an Oracle database;
S404: importing the enterprise relationship data into a Neo4j database to generate an enterprise relationship graph;
S405: generating enterprise information tags for the enterprise according to the enterprise relationship graph.
The enterprise information tag generation method provided by the embodiment of the invention is suitable for various scenarios in which enterprise information data needs to be classified and screened; it can generate personalized enterprise information tags according to the requirements of different scenarios, and makes it easier to screen valid enterprise information in those scenarios. For example, when screening enterprises for a suitable preferential subsidy policy, enterprise information data such as the scale, category, and award status of the enterprise needs to be analyzed and sorted. The following description takes as an example the application of the enterprise information tag generation method provided in the embodiment of the present invention to generating enterprise information tags for a preferential subsidy policy.
When enterprise information tags are generated for an enterprise with respect to the preferential subsidy policy:
Step 1: cleaning and classifying the original data according to the preferential subsidy policy to obtain valid classified data; and constructing enterprise relationship data according to the valid classified data, wherein the enterprise relationship data comprises enterprise scale relationship data and enterprise shareholder relationship data.
Here, the enterprise scale relationship data includes, for example, practitioner data, revenue data, total assets data of the enterprise, and the enterprise shareholder relationship data includes, for example, shareholder data, shareholder control stock route data, and shareholder benefit ratio data of the enterprise.
Step 2: importing the enterprise relationship data into a graph database, and generating an enterprise relationship graph according to the enterprise scale relationship data and the enterprise shareholder relationship data; the enterprise relationship map comprises an enterprise shareholder relationship map and an enterprise scale relationship map.
Illustratively, generating corresponding nodes and node names according to shareholder data and shareholder benefit ratio data of the enterprise, and connecting each node according to shareholder stock control route data to obtain an enterprise shareholder relation graph; and generating nodes corresponding to the enterprise scale categories, generating nodes corresponding to the practitioner data, the business income data and the total asset data, and respectively connecting the nodes corresponding to the practitioner data, the business income data and the total asset data with the nodes corresponding to the enterprise scale categories to obtain an enterprise scale relational graph.
Step 3: generating enterprise category tags for the enterprise according to the enterprise shareholder relationship graph, wherein the enterprise category tags include: headquarters enterprise and growth enterprise.
For example, fig. 5 is a schematic diagram of enterprise information tags based on the preferential subsidy policy and the corresponding tag definition rules according to an embodiment of the present invention. The enterprise category tag is a primary tag and includes two secondary tags, namely headquarters enterprise and growth enterprise. The tag definition rule for a headquarters enterprise is that the final beneficiary of the enterprise is also the final beneficiary of a plurality of enterprises; the tag definition rule for a growth enterprise is that the enterprise is determined to meet the requirements of a growth enterprise according to a two-dimensional judgment method. Therefore, when the enterprise category tags are generated according to the enterprise shareholder relationship graph, the shareholder data corresponding to the lowest-level node in the graph, that is, the final beneficiary of the enterprise, is used to determine whether that final beneficiary is the final beneficiary of a plurality of enterprises. When the final beneficiary is the final beneficiary of a plurality of enterprises, a headquarters enterprise tag is generated for the enterprise; when the final beneficiary is the final beneficiary of only one enterprise and the enterprise is determined to be a growth enterprise according to the two-dimensional judgment method, a growth enterprise tag is generated for the enterprise.
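The category rule above can be sketched as follows. The mapping from enterprise to final beneficiary is illustrative data, and because the description does not define the two-dimensional judgment method, it is represented here by a caller-supplied predicate (an assumption of this sketch, not the method itself).

```python
from collections import Counter
from typing import Callable, Dict, Optional

# Illustrative data: enterprise -> final beneficiary (lowest-level shareholder node).
FINAL_BENEFICIARY: Dict[str, str] = {
    "enterprise A": "beneficiary X",
    "enterprise B": "beneficiary X",
    "enterprise C": "beneficiary Y",
}


def category_tag(enterprise: str,
                 final_beneficiary: Dict[str, str],
                 is_growth: Callable[[str], bool]) -> Optional[str]:
    """Apply the headquarters rule first, then the growth rule."""
    counts = Counter(final_beneficiary.values())
    owner = final_beneficiary[enterprise]
    if counts[owner] > 1:
        return "headquarters enterprise"
    if is_growth(enterprise):
        return "growth enterprise"
    return None  # neither rule applies


# Placeholder predicate standing in for the two-dimensional judgment method.
def always_growth(_: str) -> bool:
    return True


print(category_tag("enterprise A", FINAL_BENEFICIARY, always_growth))
print(category_tag("enterprise C", FINAL_BENEFICIARY, always_growth))
```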
Step 4: generating enterprise scale tags for the enterprise according to the enterprise scale relationship graph, wherein the enterprise scale tags include: large enterprise, medium enterprise, small enterprise, and micro enterprise.
Illustratively, as shown in fig. 5, the enterprise scale tag is a primary tag and includes four secondary tags, namely large enterprise, medium enterprise, small enterprise, and micro enterprise. The enterprise scale is determined according to the "Provisions on the Classification Standards for Small and Medium-sized Enterprises" formulated in 2011 by the Ministry of Industry and Information Technology, the National Bureau of Statistics, the National Development and Reform Commission, and the Ministry of Finance, and the "Measures for the Classification of Large, Medium, Small and Micro Enterprises for Statistical Purposes (2017)" issued by the National Bureau of Statistics, with three indexes, namely practitioner data, business income data, and total asset data of the enterprise, taken as the standards for measuring enterprise scale. Accordingly, as shown in fig. 5, the tag definition rules under these two regulations are as follows: for a large enterprise, the three indexes of practitioner data, business income data, and total asset data all meet the requirements of a large enterprise for the industry to which the enterprise belongs; for a medium enterprise, the three indexes all meet the requirements of a medium enterprise for the industry to which the enterprise belongs; for a small enterprise, the three indexes all meet the requirements of a small enterprise for the industry to which the enterprise belongs; and for a micro enterprise, any one of the three indexes meets the requirements of a micro enterprise for the industry to which the enterprise belongs.
Therefore, when the enterprise scale tag of an enterprise is generated according to the enterprise scale relationship graph, the nodes corresponding to the practitioner data, business income data, and total asset data in the graph are used, for example, to determine which of the preset tag definition rules for large, medium, small, and micro enterprises the enterprise's scale satisfies, and the corresponding tag is generated: a large enterprise tag when the enterprise scale meets the definition of a large enterprise; a medium enterprise tag when it meets the definition of a medium enterprise; a small enterprise tag when it meets the definition of a small enterprise; and a micro enterprise tag when it meets the definition of a micro enterprise.
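The scale rule can be sketched as a tiered threshold check. The numeric lower bounds below are placeholders invented for this example and do not reproduce the cited regulations, whose thresholds depend on the industry; the sketch also assumes that each tier is defined by lower bounds, so that an enterprise missing any small-enterprise bound falls into the micro tier.

```python
from typing import Dict, Tuple

# HYPOTHETICAL lower bounds for one unnamed industry:
# (practitioners, business income, total assets). Not the regulatory values.
LOWER_BOUNDS: Dict[str, Tuple[float, float, float]] = {
    "large enterprise": (1000, 40000, 80000),
    "medium enterprise": (300, 2000, 5000),
    "small enterprise": (20, 300, 300),
}


def scale_tag(practitioners: float, income: float, total_assets: float) -> str:
    """Return the highest tier whose three lower bounds are all met."""
    values = (practitioners, income, total_assets)
    # Dicts keep insertion order, so tiers are tried from large to small.
    for tag, bounds in LOWER_BOUNDS.items():
        if all(v >= b for v, b in zip(values, bounds)):
            return tag
    # At least one index is below the small-enterprise bound.
    return "micro enterprise"


print(scale_tag(1500, 50000, 90000))
print(scale_tag(500, 3000, 6000))
print(scale_tag(10, 500, 500))
```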
In addition, the preset tag definition rules may further include specific enterprise lists in the database related information table (for example, lists of enterprises recorded for each listing type, and lists of award-winning enterprises for various awards). Whether an enterprise appears in a given enterprise list is determined according to the node corresponding to the enterprise identifier in the enterprise relationship graph, and if so, the corresponding enterprise information tag is generated for the enterprise. As shown in fig. 5, the primary tags further include a listing type class and an award class: the tag definition rule of the listing type class generates the corresponding listing type tag for an enterprise that appears in the list recorded for that listing type, and the tag definition rule of the award class generates the tag corresponding to an award for an enterprise that appears in the list of winners of that award.
Here, the listing types include, for example, off-market, New Third Board listing, main board listing, SME board listing, and the like; the enterprise award lists include, for example, a municipal-level high-tech enterprise filing list, a national-level high-tech enterprise list, a Science and Technology Little Giant leading enterprise list, and the like.
In this way, the corresponding enterprise information tags are generated for the enterprise with respect to the preferential subsidy policy, and when enterprises are screened for the policy, whether an enterprise meets the requirements of the policy can be judged from its enterprise information tags, which improves the efficiency of information screening.
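The list-based rules amount to membership checks, which can be sketched as below. The list contents and enterprise names are made up for illustration; in the described method the lists would come from the database related information table.

```python
from typing import Dict, List, Set

# Illustrative lists: tag name -> set of enterprise identifiers on that list.
ENTERPRISE_LISTS: Dict[str, Set[str]] = {
    "main board listing": {"enterprise A"},
    "national-level high-tech enterprise": {"enterprise A", "enterprise B"},
}


def list_tags(enterprise: str, lists: Dict[str, Set[str]]) -> List[str]:
    """Return one tag per list on which the enterprise appears."""
    return sorted(tag for tag, members in lists.items() if enterprise in members)


print(list_tags("enterprise A", ENTERPRISE_LISTS))
print(list_tags("enterprise B", ENTERPRISE_LISTS))
```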
The embodiment of the invention also provides an enterprise information label generating device, which is described in the following embodiment. Because the principle of the device for solving the problems is similar to the enterprise information tag generation method, the implementation of the device can refer to the implementation of the enterprise information tag generation method, and repeated parts are not described again.
As shown in fig. 6, an enterprise information tag generating apparatus provided in an embodiment of the present invention includes: an obtaining module 601, a construction module 602, a first processing module 603, and a second processing module 604, wherein:
an obtaining module 601, configured to obtain original data, where the original data includes enterprise information data obtained from a database related information table, a network public resource, and an enterprise supplementary recording data channel;
a construction module 602, configured to construct enterprise relationship data according to the original data;
the first processing module 603 is configured to import the enterprise relationship data into a graph database, and generate an enterprise relationship graph;
and a second processing module 604, configured to generate enterprise information tags for the enterprises according to the enterprise relationship maps.
In one possible embodiment, the enterprise information data includes at least one of: enterprise shareholder data, enterprise practitioner data, business income data, total asset data, and enterprise listing data.
In one possible implementation manner, the obtaining module is specifically configured to acquire enterprise information data from financial reports of listed companies, network public data, and news by a crawler method; to acquire enterprise information data by traversing the database related information table; and to acquire enterprise information data fed back by the enterprise through the enterprise supplementary recording data channel.
In a possible implementation manner, the obtaining module is specifically configured to generate required supplementary-entry content prompt information and optional supplementary-entry content prompt information according to the required enterprise information data; and to generate an enterprise supplementary recording data channel according to the required and optional supplementary-entry content prompt information, and receive the enterprise information data fed back by the enterprise through that channel.
In a possible implementation manner, the construction module is specifically configured to clean the original data to remove duplicate enterprise information data and invalid enterprise information data; and to construct the enterprise relationship data according to the cleaned original data and a pre-trained enterprise relationship data construction model, wherein the enterprise relationship data construction model is a machine-learning model for mining and establishing the relationship structure among various kinds of enterprise data.
In one possible embodiment, the apparatus further comprises a third processing module, configured to acquire enterprise information data from the Internet by a crawler method; to perform relationship data labeling on the enterprise information data to obtain training samples; and to train the enterprise relationship data construction model with the training samples.
In a possible implementation manner, the construction module is specifically configured to set different weights for the enterprise information data acquired from the database related information table, the network public resources, and the enterprise supplementary recording data channel; to compare the enterprise information data acquired from the three channels, remove duplicate enterprise information data, and, where data of the same type is inconsistent across channels, retain the data with the highest weight for that type, to obtain valid data; and to store the valid data according to a preset data structure by using a word segmentation tool.
In a possible implementation manner, the construction module is specifically configured to classify the cleaned original data by using an enterprise relational data construction model; and according to the classification result, performing level classification on the data in each class, and generating a subordinate relationship among the data to obtain enterprise relationship data.
In a possible implementation manner, the first processing module is specifically configured to import the enterprise relationship data into a Neo4j database, and generate a node and a node name corresponding to each data according to a classification result in the enterprise relationship data and data included in each class; and aiming at each category, connecting each node according to the grade of each data in the category and the affiliation among the data to obtain the enterprise relationship graph.
In a possible implementation manner, the second processing module is specifically configured to generate an enterprise information tag of a node according to a node name of a node with a lowest level in each category and each dependency relationship.
In a possible implementation manner, the construction module is specifically configured to clean the original data and convert the original data into a preset format; and importing the cleaned and converted original data into an Oracle database, and constructing enterprise relational data by adopting a relational database technology based on the Oracle database.
In a possible implementation manner, the first processing module is specifically configured to convert the enterprise relationship data into a text data format by using the sqluldr2 tool, import the converted enterprise relationship data into a Neo4j database, and generate the enterprise relationship graph.
In a possible implementation manner, the construction module is specifically configured to clean and classify the original data according to a preferential subsidy policy to obtain valid classified data, and to construct enterprise relationship data according to the valid classified data, wherein the enterprise relationship data comprises enterprise scale relationship data and enterprise shareholder relationship data; the first processing module is specifically configured to import the enterprise relationship data into a graph database and generate an enterprise relationship graph according to the enterprise scale relationship data and the enterprise shareholder relationship data, wherein the enterprise relationship graph comprises an enterprise shareholder relationship graph and an enterprise scale relationship graph; and the second processing module is specifically configured to generate enterprise category tags for the enterprise according to the enterprise shareholder relationship graph, wherein the enterprise category tags include headquarters enterprise and growth enterprise, and to generate enterprise scale tags for the enterprise according to the enterprise scale relationship graph, wherein the enterprise scale tags include large enterprise, medium enterprise, small enterprise, and micro enterprise.
Based on the aforementioned inventive concept, as shown in fig. 7, the present invention further provides a computer device 700, which includes a memory 710, a processor 720, and a computer program 730 stored on the memory 710 and operable on the processor 720, wherein the processor 720 implements the aforementioned enterprise information tag generation method when executing the computer program 730.
An embodiment of the present invention further provides a computer-readable storage medium, where a computer program for executing the above-mentioned enterprise information tag generation method is stored in the computer-readable storage medium.
In the embodiment of the invention, original data is acquired, wherein the original data comprises enterprise information data acquired from a database related information table, network public resources, and an enterprise supplementary recording data channel; enterprise relationship data is constructed according to the original data; the enterprise relationship data is imported into a graph database to generate an enterprise relationship graph; and enterprise information tags are generated for the enterprise according to the enterprise relationship graph. Compared with the prior-art approach, in which enterprise information data is summarized and sorted manually to extract valid information at a large cost in manpower and material resources, the valid information in the enterprise information data can be understood quickly and intuitively from the enterprise information tags, the enterprise information data is summarized, and the efficiency of obtaining valid information from the enterprise information data is improved.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are only exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (25)

1. An enterprise information tag generation method is characterized by comprising the following steps:
acquiring original data, wherein the original data comprises enterprise information data acquired from a database related information table, network public resources and an enterprise supplementary recording data channel;
constructing enterprise relationship data according to the original data;
importing the enterprise relationship data into a graph database to generate an enterprise relationship graph;
and generating enterprise information labels for the enterprise according to the enterprise relationship graph.
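The four steps of claim 1 can be sketched as a small pipeline. This is only an illustrative sketch under stated assumptions: the function names, the record layout (a dict with an "enterprise" key) and the in-memory adjacency map standing in for the graph database are all hypothetical and are not taken from the patent.

```python
from collections import defaultdict

def acquire_raw_data(db_rows, public_rows, supplementary_rows):
    """Step 1: pool records from the three sources named in claim 1."""
    return list(db_rows) + list(public_rows) + list(supplementary_rows)

def build_relationship_data(raw):
    """Step 2: turn each record into (enterprise, relation, value) triples.
    Using a set drops exact duplicates that arrive from several sources."""
    triples = set()
    for record in raw:
        for key, value in record.items():
            if key != "enterprise":
                triples.add((record["enterprise"], key, value))
    return sorted(triples)

def build_graph(triples):
    """Step 3: hypothetical stand-in for the graph database (adjacency map)."""
    graph = defaultdict(list)
    for enterprise, relation, value in triples:
        graph[enterprise].append((relation, value))
    return dict(graph)

def generate_labels(graph):
    """Step 4: read one label off each edge of an enterprise node."""
    return {ent: [f"{rel}:{val}" for rel, val in edges]
            for ent, edges in graph.items()}

raw = acquire_raw_data(
    [{"enterprise": "A Co", "scale": "small"}],   # database table row
    [{"enterprise": "A Co", "listed": "no"}],     # public web record
    [{"enterprise": "A Co", "scale": "small"}],   # supplementary entry
)
labels = generate_labels(build_graph(build_relationship_data(raw)))
```

In this toy run the duplicated "scale" record collapses to one triple, so the enterprise ends up with one label per distinct fact.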
2. The method of claim 1, wherein the enterprise information data comprises at least one of: enterprise shareholder data, enterprise practitioner data, business income data, total asset data, and enterprise listing data.
3. The method of generating an enterprise information tag of claim 1, wherein obtaining raw data comprises:
acquiring enterprise information data from financial reports of listed companies, network public data and news by a crawler method;
acquiring enterprise information data by traversing the related information table of the database;
and acquiring enterprise information data fed back by the enterprise through the enterprise supplementary recording data channel.
4. The method of claim 3, wherein obtaining the enterprise information data fed back by the enterprise through the enterprise supplementary recording data channel comprises:
generating required supplementary-entry content prompt information and optional supplementary-entry content prompt information according to the enterprise information data that is needed;
and generating the enterprise supplementary recording data channel according to the required supplementary-entry content prompt information and the optional supplementary-entry content prompt information, and receiving enterprise information data fed back by the enterprise through the enterprise supplementary recording data channel.
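One possible reading of claim 4 is a form with required and optional fields. The sketch below assumes that reading; the function names, field names and the rule of rejecting a submission that lacks a required field are hypothetical illustrations, not details given in the patent.

```python
def build_supplementary_form(needed_fields, required):
    """Split the needed fields into required and optional prompts."""
    return {
        "required": [f for f in needed_fields if f in required],
        "optional": [f for f in needed_fields if f not in required],
    }

def receive_submission(form, submission):
    """Accept enterprise feedback only if every required field is filled in,
    and keep only the fields that the form actually asked for."""
    missing = [f for f in form["required"] if not submission.get(f)]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    allowed = set(form["required"]) | set(form["optional"])
    return {k: v for k, v in submission.items() if k in allowed}

form = build_supplementary_form(
    ["shareholders", "total_assets", "employees"],  # hypothetical field names
    required={"shareholders"},
)
accepted = receive_submission(form, {"shareholders": "X Holdings", "unrelated": 1})
```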
5. The method of claim 1, wherein constructing business relationship data from the raw data comprises:
cleaning the original data, and removing repeated enterprise information data and invalid enterprise information data in the original data;
constructing enterprise relationship data according to the cleaned original data and a pre-trained enterprise relationship data construction model; wherein the enterprise relationship data construction model is a machine learning model for mining and establishing the relationship structure among the various items of enterprise data.
6. The method of generating an enterprise information tag of claim 5, further comprising:
acquiring enterprise information data from the Internet by a crawler method;
and carrying out relation data labeling on the enterprise information data to obtain a training sample, and training the enterprise relation data construction model by using the training sample.
7. The method for generating the enterprise information label according to claim 5, wherein the cleaning of the original data to remove the repeated enterprise information data and other data except the enterprise information data in the original data comprises:
setting different weights for the enterprise information data acquired from the database related information table, the network public resources and the enterprise supplementary recording data channel, respectively;
comparing the enterprise information data acquired from the database related information table, the network public resources and the enterprise supplementary recording data channel, removing repeated enterprise information data, and, where enterprise information data of the same type is inconsistent, retaining the enterprise information data with the highest weight in that type, to obtain effective data;
and storing the effective data according to a preset data structure by using a word segmentation tool.
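The weighting and comparison steps of claim 7 could be sketched as follows. The weight values, source names and record layout are hypothetical assumptions chosen for illustration; the patent does not state which source is trusted most, and the word segmentation storage step is not shown.

```python
# Hypothetical weights: a higher number means a more trusted source.
SOURCE_WEIGHTS = {"supplementary": 3, "database": 2, "public_web": 1}

def clean_records(records, weights=SOURCE_WEIGHTS):
    """records: iterable of (source, enterprise, field, value) tuples.

    Exact repeats collapse onto one key, and when sources disagree on the
    same (enterprise, field) the value from the highest-weight source wins.
    On equal weight the value seen first is kept."""
    best = {}
    for source, enterprise, field, value in records:
        key = (enterprise, field)
        current = best.get(key)
        if current is None or weights[source] > weights[current[0]]:
            best[key] = (source, value)
    return {key: value for key, (_source, value) in best.items()}

records = [
    ("public_web", "A Co", "employees", 120),
    ("database", "A Co", "employees", 100),      # conflicts with the web value
    ("database", "A Co", "employees", 100),      # exact repeat
    ("supplementary", "A Co", "total_assets", 5_000_000),
]
effective = clean_records(records)
```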
8. The method of claim 5, wherein constructing the enterprise relationship data according to the cleaned original data and the pre-trained enterprise relationship data construction model comprises:
classifying the cleaned original data by utilizing an enterprise relational data construction model;
and according to the classification result, performing level classification on the data in each class, and generating a subordinate relationship among the data to obtain enterprise relationship data.
9. The method of claim 8, wherein importing the business relationship data into a graph database to generate a business relationship graph comprises:
importing the enterprise relational data into a Neo4j database, and generating a node corresponding to each data and a node name according to a category division result in the enterprise relational data and data contained in each category;
and aiming at each category, connecting each node according to the grade of each data in the category and the affiliation among the data to obtain the enterprise relationship graph.
10. The method of claim 9, wherein generating enterprise information labels for an enterprise according to an enterprise relationship graph comprises:
and generating the enterprise information label of the node according to the node name of the node with the lowest level under each category and each subordination relation.
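Claims 8 to 10 describe categories, levels within each category, subordinate relationships, and labels read from the lowest-level nodes. A minimal in-memory sketch of that idea is given below. It assumes the classification and level assignment have already been done, and it uses plain Python structures instead of Neo4j; the item layout (category, name, level, parent) and the sample names are hypothetical.

```python
def build_graph(relationship_data):
    """relationship_data: list of dicts with keys category, name, level and
    parent (None for the top level). Returns (nodes, edges)."""
    nodes = {}
    edges = []
    for item in relationship_data:
        node_id = (item["category"], item["name"])
        nodes[node_id] = item["level"]
        if item["parent"] is not None:
            # Edge runs from the superior node to its subordinate node.
            edges.append(((item["category"], item["parent"]), node_id))
    return nodes, edges

def leaf_labels(nodes, edges):
    """Use the name of each node without subordinates as a label."""
    parents = {parent for parent, _child in edges}
    labels = {}
    for category, name in nodes:
        if (category, name) not in parents:
            labels.setdefault(category, []).append(name)
    return labels

data = [
    {"category": "scale", "name": "enterprise scale", "level": 1, "parent": None},
    {"category": "scale", "name": "small and micro", "level": 2, "parent": "enterprise scale"},
    {"category": "scale", "name": "micro enterprise", "level": 3, "parent": "small and micro"},
    {"category": "shareholder", "name": "shareholder type", "level": 1, "parent": None},
    {"category": "shareholder", "name": "state-owned", "level": 2, "parent": "shareholder type"},
]
nodes, edges = build_graph(data)
labels = leaf_labels(nodes, edges)
```

Taking the leaf of each branch, rather than the node with the numerically largest level in the whole category, is a design choice of this sketch: it yields one label per subordinate relationship chain.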
11. The method of claim 1, wherein constructing business relationship data from the raw data comprises:
cleaning the original data and converting the original data into a preset format;
and importing the cleaned and converted original data into an Oracle database, and constructing enterprise relational data by adopting a relational database technology based on the Oracle database.
12. The method of claim 11, wherein importing the business relationship data into a graph database to generate a business relationship graph comprises:
converting the enterprise relationship data into a text data format by using the sqluldr2 tool, importing the converted enterprise relationship data into a Neo4j database, and generating the enterprise relationship graph.
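The hand-off from exported text to the graph database might look like the sketch below. It assumes the exported text is comma-separated with a header row and hypothetical columns src, rel and dst; the node label Entity and the relationship type RELATES are likewise invented for illustration. The code only prepares parameterized Cypher statements and does not connect to a Neo4j server.

```python
import csv
import io

# Parameterized Cypher: MERGE avoids creating the same node or edge twice.
CYPHER = (
    "MERGE (a:Entity {name: $src}) "
    "MERGE (b:Entity {name: $dst}) "
    "MERGE (a)-[:RELATES {type: $rel}]->(b)"
)

def rows_to_cypher(text):
    """Turn exported text rows into (statement, parameters) pairs."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        (CYPHER, {"src": row["src"], "rel": row["rel"], "dst": row["dst"]})
        for row in reader
    ]

exported = "src,rel,dst\nA Co,SHAREHOLDER_OF,B Co\n"
statements = rows_to_cypher(exported)
```

Each pair could then be passed to whatever Neo4j client is in use; passing values as parameters rather than formatting them into the query string avoids quoting problems with enterprise names.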
13. The method of claim 1, wherein constructing business relationship data from the raw data comprises:
cleaning and classifying the original data according to a comet-benefit policy to obtain effective classified data;
constructing enterprise relation data according to the effective classification data; the enterprise relation data comprises enterprise scale relation data and enterprise shareholder relation data;
importing the enterprise relationship data into a graph database to generate an enterprise relationship graph, wherein the method comprises the following steps:
importing the enterprise relationship data into a graph database, and generating an enterprise relationship graph according to the enterprise scale relationship data and the enterprise shareholder relationship data; the enterprise relationship map comprises an enterprise shareholder relationship map and an enterprise scale relationship map;
generating enterprise information labels for the enterprises according to the enterprise relationship maps, wherein the method comprises the following steps:
generating an enterprise category label of the enterprise according to the enterprise shareholder relationship graph; wherein the enterprise category label includes: headquarters enterprises, growth enterprises;
generating an enterprise scale label of the enterprise according to the enterprise scale relationship graph; wherein the enterprise scale label includes: large-scale enterprises, medium-scale enterprises, small-scale enterprises and micro enterprises.
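As a rough illustration of the enterprise scale label in claim 13, the sketch below assigns one of the four scale labels from employee count and business income, two of the data types listed in claim 2. The threshold values are purely hypothetical placeholders; the patent does not give thresholds, and real classification standards vary by industry.

```python
# (label, minimum employees, minimum business income); hypothetical values.
SCALE_THRESHOLDS = [
    ("large-scale enterprise", 1000, 400_000_000),
    ("medium-scale enterprise", 300, 20_000_000),
    ("small-scale enterprise", 20, 3_000_000),
]

def scale_label(employees, revenue):
    """Return the first scale whose two minimums are both met,
    falling back to 'micro enterprise' when none is met."""
    for label, min_employees, min_revenue in SCALE_THRESHOLDS:
        if employees >= min_employees and revenue >= min_revenue:
            return label
    return "micro enterprise"
```

Requiring both minimums means an enterprise with many employees but very low income falls through to a smaller scale; whether both conditions or either one should apply is an assumption of this sketch.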
14. An enterprise information tag generation apparatus, comprising:
the acquisition module is used for acquiring original data, wherein the original data comprises enterprise information data acquired from a database related information table, network public resources and an enterprise supplementary recording data channel;
the construction module is used for constructing enterprise relation data according to the original data;
the first processing module is used for importing the enterprise relationship data into a graph database to generate an enterprise relationship graph;
and the second processing module is used for generating enterprise information labels for the enterprises according to the enterprise relation maps.
15. The apparatus according to claim 14, wherein the acquisition module is specifically configured to: acquire enterprise information data from financial reports of listed companies, network public data and news by a crawler method; acquire enterprise information data by traversing the related information table of the database; and acquire enterprise information data fed back by the enterprise through the enterprise supplementary recording data channel.
16. The apparatus according to claim 15, wherein the acquisition module is specifically configured to: generate required supplementary-entry content prompt information and optional supplementary-entry content prompt information according to the enterprise information data that is needed; and generate the enterprise supplementary recording data channel according to the required supplementary-entry content prompt information and the optional supplementary-entry content prompt information, and receive enterprise information data fed back by the enterprise through the enterprise supplementary recording data channel.
17. The apparatus according to claim 14, wherein the construction module is specifically configured to clean the original data, and remove duplicate enterprise information data and invalid enterprise information data from the original data;
constructing enterprise relationship data according to the cleaned original data and a pre-trained enterprise relationship data construction model; wherein the enterprise relationship data construction model is a machine learning model for mining and establishing the relationship structure among the various items of enterprise data.
18. The apparatus of claim 17, further comprising:
the third processing module is used for acquiring enterprise information data from the Internet by a crawler method;
and carrying out relation data labeling on the enterprise information data to obtain a training sample, and training the enterprise relation data construction model by using the training sample.
19. The apparatus according to claim 17, wherein the construction module is specifically configured to set different weights for the enterprise information data acquired from the database related information table, the network public resources and the enterprise supplementary recording data channel, respectively;
comparing the enterprise information data acquired from the database related information table, the network public resources and the enterprise supplementary recording data channel, removing repeated enterprise information data, and, where enterprise information data of the same type is inconsistent, retaining the enterprise information data with the highest weight in that type, to obtain effective data;
and storing the effective data according to a preset data structure by using a word segmentation tool.
20. The apparatus according to claim 17, wherein the construction module is specifically configured to classify the cleaned raw data by using an enterprise relational data construction model;
and according to the classification result, performing level classification on the data in each class, and generating a subordinate relationship among the data to obtain enterprise relationship data.
21. The apparatus according to claim 20, wherein the first processing module is specifically configured to import the business relationship data into a Neo4j database, and generate a node and a node name corresponding to each piece of data according to a classification result in the business relationship data and data included in each class;
and aiming at each category, connecting each node according to the grade of each data in the category and the affiliation among the data to obtain the enterprise relationship graph.
22. The apparatus according to claim 21, wherein the second processing module is specifically configured to generate the enterprise information label of the node according to the node name of the node with the lowest level in each category and each dependency relationship.
23. The apparatus according to claim 14, wherein the construction module is specifically configured to clean the raw data and convert the raw data into a preset format;
and importing the cleaned and converted original data into an Oracle database, and constructing enterprise relational data by adopting a relational database technology based on the Oracle database.
24. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the enterprise information tag generation method of any one of claims 1 to 13 when executing the computer program.
25. A computer-readable storage medium, wherein a computer program for executing the method for generating an enterprise information tag according to any one of claims 1 to 13 is stored in the computer-readable storage medium.
CN202110744350.9A 2021-06-30 2021-06-30 Enterprise information tag generation method and device Pending CN113485987A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110744350.9A CN113485987A (en) 2021-06-30 2021-06-30 Enterprise information tag generation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110744350.9A CN113485987A (en) 2021-06-30 2021-06-30 Enterprise information tag generation method and device

Publications (1)

Publication Number Publication Date
CN113485987A true CN113485987A (en) 2021-10-08

Family

ID=77937417

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110744350.9A Pending CN113485987A (en) 2021-06-30 2021-06-30 Enterprise information tag generation method and device

Country Status (1)

Country Link
CN (1) CN113485987A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115310609A (en) * 2022-10-10 2022-11-08 中信证券股份有限公司 Method, device and related equipment for constructing derivative guarantee map

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination