CN117391440A - Enterprise information reconnaissance platform and method - Google Patents

Enterprise information reconnaissance platform and method

Publication number
CN117391440A
Authority
CN
China
Prior art keywords
data
enterprise
model
information
unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311322724.3A
Other languages
Chinese (zh)
Inventor
施荣芳
林大鹏
吴爽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Datacom Corp ltd
Original Assignee
China Datacom Corp ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Datacom Corp ltd
Priority to CN202311322724.3A
Publication of CN117391440A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06395 Quality analysis or management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0203 Market surveys; Market polls

Landscapes

  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • General Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Marketing (AREA)
  • Finance (AREA)
  • Game Theory and Decision Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Educational Administration (AREA)
  • Tourism & Hospitality (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Data Mining & Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an enterprise information reconnaissance platform and method. The method comprises: expanding the data sources based on initial enterprise client data, introducing business cooperation data, collecting website data from credit investigation sites, vertical portals, and enterprise news, and obtaining target enterprise data through data planning processing; fusing enterprise business opportunity information based on the target enterprise data to construct a market space mining model; carrying out enterprise business condition analysis and business opportunity mining according to the market space mining model to obtain a prediction result; and carrying out risk judgment and early warning according to the prediction result, obtaining abnormal conditions of enterprise operation, and outputting a visual abnormal result and an analysis report. The reconnaissance platform is built on lake-warehouse integration (data lake plus data warehouse) and stream-batch integration technologies, so it can better meet enterprise data analysis requirements, provide comprehensive data insight, and support real-time decision making and data-driven business development while combining flexibility, comprehensiveness, and performance.

Description

Enterprise information reconnaissance platform and method
Technical Field
The invention belongs to the field of information reconnaissance, and particularly relates to an enterprise information reconnaissance platform and method.
Background
An enterprise information reconnaissance platform is a kind of big data platform. Big data platforms currently on the market apply a wide and diverse range of technologies, but they still show the following shortcomings in data security, real-time data integration, data processing and analysis performance, and algorithms and models:
1. Data security: the platform may store sensitive information, which may leak if improperly protected. Improperly configured data access rights may also allow unauthorized persons to access sensitive data.
2. Real-time data integration: real-time data stream integration and batch data integration have different performance requirements. Real-time stream processing requires low latency and high throughput, while batch processing focuses on efficient large-scale data analysis. Most current platforms cannot balance the two requirements; the two become disconnected, so the needs of real-time applications are not well met and batch data cannot be processed efficiently.
3. Data processing and analysis: the platform requires a large amount of computing resources, which can make analysis slow. At the same time, integrating data from different sources into one platform can be very complex: differences in data formats, protocols, and so on must be handled, and structured, semi-structured, and unstructured data must all be supported, each requiring complex processing methods.
4. Algorithms and models: most data platforms require users or developers to write suitable model scripts by hand, which demands extensive domain knowledge and experience and is a significant challenge for non-professional data analysts. Platforms that provide no automated data analysis models suffer from poor performance and low efficiency, and using an unsuitable model leads to inaccurate analysis results.
Disclosure of Invention
To solve the above problems, the invention provides an enterprise information reconnaissance platform and method. The enterprise information reconnaissance platform includes:
the data acquisition module is used for expanding a data source based on initial enterprise client data, introducing business cooperation data and simultaneously collecting website data of credit investigation, vertical portals and enterprise news;
the data processing module is connected with the data acquisition module and is used for carrying out data planning processing on the expansion data to obtain target enterprise data;
the model construction module is connected with the data processing module and used for fusing enterprise business opportunity information based on the target enterprise data to construct a market space mining model;
the model prediction module is connected with the model construction module and is used for carrying out enterprise business condition analysis and business opportunity mining according to the market space mining model to obtain a prediction result;
the risk prediction module is connected with the model prediction module and used for carrying out risk judgment and early warning according to the prediction result, obtaining abnormal conditions of enterprise operation and outputting a visual risk analysis result and an analysis report.
Preferably, the data processing module comprises a data aggregation unit, a data cleaning and integrating unit, a data storage unit and an interface design unit;
the data aggregation unit is used for importing data from relational databases into Hadoop or Hive through Sqoop commands using the Sqoop component, and for handling serial and parallel data import tasks using the Oozie component;
the data cleaning and integrating unit is used for applying, when data is imported, the Hadoop, Hive, and Spark components to dirty data containing useless information, performing consistency checks, invalid and missing value handling, duplicate data handling, and data standardization;
the data storage unit is used for storing the processed structured data in the corresponding database by having Spark call the interfaces of the Oracle and MySQL relational databases and the in-memory database Redis;
the interface design unit is used for defining the endpoints and data interaction modes of the API according to RESTful design principles and defining the data format of the API in JSON or XML.
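By way of illustration only, the four cleaning operations named for the data cleaning and integrating unit may be sketched as follows. The sketch uses plain Python as a stand-in for the Hadoop, Hive, and Spark components; the record layout and the field names `name`, `credit_code`, and `registered_capital` are assumptions made for the example and do not appear in the original disclosure.

```python
from typing import Iterable


def clean_records(records: Iterable[dict]) -> list[dict]:
    """Apply the four cleaning steps to enterprise records held as dicts."""
    seen = set()
    cleaned = []
    for rec in records:
        name = (rec.get("name") or "").strip()
        # Invalid/missing value handling: drop rows without an enterprise name.
        if not name:
            continue
        # Standardization: collapse inner whitespace, upper-case the credit code.
        name = " ".join(name.split())
        code = (rec.get("credit_code") or "").strip().upper()
        # Consistency check: registered capital must be a non-negative number,
        # otherwise it is treated as unknown.
        capital = rec.get("registered_capital")
        if not isinstance(capital, (int, float)) or capital < 0:
            capital = None
        # Duplicate handling: keep the first record per (name, code) key.
        key = (name, code)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(
            {"name": name, "credit_code": code, "registered_capital": capital}
        )
    return cleaned
```

Standardizing before deduplicating matters here: two records that differ only in spacing or letter case collapse to the same key and are kept once.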
Preferably, the data storage unit comprises a data lake and a data warehouse;
the data lake is used for storing original and unprocessed data, including structured, semi-structured and unstructured data;
the data warehouse is used for storing data subjected to cleaning, conversion and arrangement so as to support high-performance batch analysis.
Preferably, the market space mining model comprises a blue sea market space mining model and a stock market space mining model;
the blue sea market space mining model comprises a business bid analysis model, a gateway enterprise model and a newly-built enterprise mining model;
the stock market space mining model comprises a demand mining model and a service upgrading model.
Preferably, the model prediction module comprises an information base construction unit, a business opportunity mining unit and a prediction unit;
the information base construction unit is used for creating database tables from the bidding information in the imported enterprise data, then designing a data model and determining the relations between the tables to obtain a market bidding information base;
the business opportunity mining unit is used for building a market space big data mining model library based on the market bidding information base, combined with the client information of stock enterprises on the local network, and for mining the development space of government and enterprise business in the blue sea market, the stock market, and other-network markets;
the prediction unit is used for, after market space mining, visually displaying the screened enterprise data and business matching reports and exporting them to a file according to the user's needs.
Preferably, the risk prediction module comprises an information change monitoring unit, a negative information detection unit, a negative information collection unit, a risk early warning unit and a visual output unit;
the information change monitoring unit is used for monitoring and analyzing the real-time change of the data flow in the data lake by applying a real-time data flow processing technology and obtaining change information of each dimension of an enterprise;
the negative information detection unit is used for combining several algorithms, namely keyword matching, topic modeling, word embedding, sentiment analysis, a rule engine, and supervised learning, and integrating the analysis results of the market space mining model;
the negative information collecting unit is used for automatically selecting a trained model, applying the trained model to the text data to be classified, and classifying the text data into different negative information categories;
the risk early warning unit is used for notifying the relevant personnel by email and SMS when the enterprise is at risk;
the visual output unit is used for visually displaying the risk analysis result and exporting a risk analysis report.
The invention also provides an enterprise information reconnaissance method, which comprises the following steps:
expanding a data source based on initial enterprise client data, introducing business cooperation data, collecting website data from credit investigation sites, vertical portals, and enterprise news, and obtaining target enterprise data through data planning processing;
based on the target enterprise data, fusing enterprise business opportunity information and constructing a market space mining model; carrying out enterprise business condition analysis and business opportunity mining according to the market space mining model to obtain a prediction result;
and carrying out risk judgment and early warning according to the prediction result, obtaining the abnormal condition of enterprise operation, and outputting a visual abnormal result and an analysis report.
Preferably, the process of obtaining the target enterprise data through the data planning processing includes:
importing data from relational databases into Hadoop or Hive through Sqoop commands using the Sqoop component, and handling serial and parallel data import tasks using the Oozie component;
when data is imported, applying the Hadoop, Hive, and Spark components to dirty data containing useless information to perform consistency checks, invalid and missing value handling, duplicate data handling, and data standardization;
storing the processed structured data in the corresponding database by having Spark call the interfaces of relational databases such as Oracle and MySQL and the in-memory database Redis;
designing the data service interface, defining the endpoints and data interaction modes of the API according to RESTful design principles, and defining the data format of the API in JSON or XML.
Preferably, the process of carrying out enterprise business condition analysis and business opportunity mining according to the market space mining model to obtain the prediction result includes:
creating database tables from the bidding information in the imported enterprise data, designing a data model, and determining the relations between the tables to obtain a market bidding information base;
building a market space big data mining model library based on the market bidding information base, combined with the client information of stock enterprises on the local network, and mining the development space of government and enterprise business in the blue sea market, the stock market, and other-network markets;
after market space mining is carried out through the market space mining model, visually displaying the screened enterprise data and business matching reports and exporting them to a file according to the user's needs.
Preferably, the process of carrying out risk judgment and early warning according to the prediction result includes:
applying a real-time data stream processing technology to monitor and analyze real-time changes of the data streams in the data lake and obtain change information for each dimension of an enterprise;
combining several algorithms, namely keyword matching, topic modeling, word embedding, sentiment analysis, a rule engine, and supervised learning, and integrating the analysis results of the market space mining model;
automatically selecting a trained model, applying the trained model to the text data to be classified, and classifying the text data into different negative information categories;
when the enterprise is at risk, notifying the relevant personnel by email and SMS;
visually displaying the risk analysis result as required, and exporting a risk analysis report.
Compared with the prior art, the invention has the following advantages and technical effects:
1. Based on lake-warehouse integration technology, the data warehouse and the data lake are connected. The data lake stores original, unprocessed data, including structured, semi-structured, and unstructured data. The data warehouse stores cleaned, converted, and consolidated data to support high-performance batch analysis.
2. Apache Kafka and Apache Flink are used as a stream-batch integrated processing engine to support both real-time and batch data processing. The stream processing layer captures, processes, and analyzes real-time data streams as they arrive, supporting low-latency real-time decisions; the batch layer performs large-scale offline data analysis, supporting complex data mining and report generation. Real-time data stream integration and batch data processing are thereby better balanced.
3. The platform combines several algorithms, such as keyword matching, topic modeling, word embedding, sentiment analysis, rule engines, and supervised learning, and integrates the analysis results of all models, which greatly improves the accuracy of negative information identification.
4. Based on the characteristics of enterprise data, the platform presets a variety of data analysis models for different market space mining targets. Through the model setting panel, an operator can invoke a suitable model with only simple parameter settings and carry out complex data analysis. This greatly lowers the professional knowledge threshold for users and improves data analysis performance and efficiency.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application, illustrate and explain the application and are not to be construed as limiting the application. In the drawings:
FIG. 1 is a schematic diagram of a system architecture according to an embodiment of the present invention;
FIG. 2 is a flow chart of a method according to an embodiment of the invention.
Detailed Description
It should be noted that, in the case of no conflict, the embodiments and features in the embodiments may be combined with each other. The present application will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
It should be noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system such as a set of computer executable instructions, and that although a logical order is illustrated in the flowcharts, in some cases the steps illustrated or described may be performed in an order other than that illustrated herein.
Example 1
As shown in fig. 1, the enterprise information reconnaissance platform provided by the present invention includes:
the data acquisition module is used for expanding a data source based on initial enterprise client data, introducing business cooperation data and simultaneously collecting website data of credit investigation, vertical portals and enterprise news;
the data processing module is connected with the data acquisition module and is used for carrying out data planning processing on the expanded data to obtain target enterprise data;
the model construction module is connected with the data processing module and used for fusing enterprise business opportunity information based on target enterprise data to construct a market space mining model;
the model prediction module is connected with the model construction module and is used for carrying out enterprise business condition analysis and business opportunity mining according to the market space mining model to obtain a prediction result;
the risk prediction module is connected with the model prediction module and is used for carrying out risk judgment and early warning according to the prediction result, obtaining the abnormal condition of enterprise operation and outputting a visual risk analysis result and an analysis report.
Further, the data processing module comprises a data aggregation unit, a data cleaning and integrating unit, a data storage unit and an interface design unit;
the data aggregation unit is used for importing data from relational databases into Hadoop or Hive through Sqoop commands using the Sqoop component, and for handling serial and parallel data import tasks using the Oozie component;
the data cleaning and integrating unit is used for applying, when data is imported, the Hadoop, Hive, and Spark components to dirty data containing useless information, performing consistency checks, invalid and missing value handling, duplicate data handling, and data standardization;
the data storage unit is used for storing the processed structured data in the corresponding database by having Spark call the interfaces of the Oracle and MySQL relational databases and the in-memory database Redis;
the interface design unit is used for defining the endpoints and data interaction modes of the API according to RESTful design principles and defining the data format of the API in JSON or XML; for API version control, the version number is included in the URL.
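As a non-limiting illustration of the interface design unit, the following sketch shows one way a version number could be carried in the URL of a RESTful endpoint that returns JSON. The path `/api/v1/enterprises/{id}`, the in-memory `ENTERPRISES` table, and the error payloads are assumptions made for the example; the original text specifies only RESTful principles, JSON or XML, and a version number in the URL.

```python
import json
import re

# Hypothetical route: the API version is the first path segment after /api/.
ROUTE = re.compile(r"^/api/v(?P<version>\d+)/enterprises/(?P<enterprise_id>[^/]+)$")

# Stand-in for the platform's data store (assumed for illustration).
ENTERPRISES = {"E001": {"id": "E001", "name": "Example Co"}}


def handle_get(path: str) -> tuple[int, str]:
    """Resolve a GET request path to an HTTP status code and a JSON body."""
    match = ROUTE.match(path)
    if match is None:
        return 404, json.dumps({"error": "not found"})
    # Only version 1 exists in this sketch; other versions are rejected.
    if match["version"] != "1":
        return 400, json.dumps({"error": "unsupported API version"})
    record = ENTERPRISES.get(match["enterprise_id"])
    if record is None:
        return 404, json.dumps({"error": "enterprise not found"})
    return 200, json.dumps({"version": 1, "data": record})
```

Keeping the version in the path lets a later `v2` change the response format without breaking clients that still call `v1`.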
Further, the data storage unit comprises a data lake and a data warehouse;
the data lake is used to store raw, unprocessed data, including structured, semi-structured, and unstructured data;
the data warehouse is used to store cleaned, converted, and consolidated data to support high performance batch analysis.
Further, the market space mining model comprises a blue sea market space mining model and a stock market space mining model;
the blue sea market space mining model comprises a business bid analysis model, a gateway enterprise model and a newly-built enterprise mining model;
the stock market space mining model comprises a demand mining model and a service upgrading model.
The business bid analysis model comprises a feature extraction unit, a business type screening unit, a bid state screening unit, an other-feature screening unit, an interference factor eliminating unit and a parameter operation unit;
the feature extraction unit is used for extracting features related to the specified business, such as project names, project descriptions, release dates, expiration dates, bidding institutions, and locations. During extraction, natural language processing (NLP) techniques such as word segmentation, word frequency statistics, and sentiment analysis are applied to the text data to extract more information.
The business type screening unit is used for setting, in the analysis parameter setting panel, the business types or project name keywords to be extracted.
The bid state screening unit is used for setting the bid release time and bid deadline in the analysis parameter setting panel.
The other-feature screening unit is used for setting features such as region, amount, industry, and enterprise scale in the analysis parameter setting panel.
The interference factor eliminating unit is used for setting, in the analysis parameter setting panel, the items to be excluded, including regions, amounts, industries, enterprise scales, and project keywords.
The parameter operation unit is used for starting the automated algorithm preset by the platform, which runs the text classification model and the time series model according to the configured parameters to obtain a result.
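For illustration, the screening parameters described above (business keywords, bid time window, region, and excluded items) may be sketched as a simple filter. Plain keyword matching is used here in place of the text classification and time series models, whose details the original does not give; the `BidFilter` class, its field names, and the bid record layout are assumptions.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class BidFilter:
    """Hypothetical container for the values set in the parameter panel."""
    keywords: list[str]
    published_from: date
    deadline_to: date
    regions: set[str] = field(default_factory=set)  # empty set means any region
    excluded_keywords: list[str] = field(default_factory=list)


def screen_bids(bids: list[dict], f: BidFilter) -> list[dict]:
    """Return the bids that satisfy every configured screening condition."""
    selected = []
    for bid in bids:
        text = f"{bid['project_name']} {bid.get('description', '')}".lower()
        # Business type screening: at least one keyword must appear.
        if not any(k.lower() in text for k in f.keywords):
            continue
        # Bid state screening: release date and deadline must fall in the window.
        if bid["published"] < f.published_from or bid["deadline"] > f.deadline_to:
            continue
        # Other-feature screening: restrict to the selected regions, if any.
        if f.regions and bid["region"] not in f.regions:
            continue
        # Interference factor elimination: drop bids containing excluded terms.
        if any(k.lower() in text for k in f.excluded_keywords):
            continue
        selected.append(bid)
    return selected
```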
The gateway enterprise model comprises an association relation definition unit, a first association element configuration unit, a first model control unit and a first data result storage unit;
the association relation definition unit is used for defining association relations among enterprises according to analysis application requirements, including shareholder relations, business cooperation relations, fund account relations, external investment relations, and regional relations.
The first association element configuration unit is used for selecting the corresponding association relation analysis model through the parameter setting panel of the analysis model and configuring association elements such as the shareholder name, the business name, and the region where the business is located.
The first model control unit is used for controlling whether the analysis model runs; when the analysis model runs, an SQL tool preset by the platform associates the enterprises with the various relation analysis elements and finds the enterprise pairs and enterprise groups that share the same association factors.
The first data result storage unit is used for automatically storing the analyzed data in a data table, so that it can be retrieved or imported at any time and displayed in visual charts.
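The SQL association described for the first model control unit may, for example, take the form of a self-join on the shared association factor. The sketch below uses Python's built-in `sqlite3` module as a stand-in for the platform's preset SQL tool; the `shareholding` table and its columns are assumptions chosen for the shareholder relation.

```python
import sqlite3


def shared_shareholder_pairs(rows: list[tuple[str, str]]) -> list[tuple[str, str, str]]:
    """Find enterprise pairs that have a shareholder in common.

    rows holds (enterprise, shareholder) tuples; the result holds
    (enterprise_a, enterprise_b, shareholder) tuples.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE shareholding (enterprise TEXT, shareholder TEXT)")
    conn.executemany("INSERT INTO shareholding VALUES (?, ?)", rows)
    # Self-join on the association factor; a.enterprise < b.enterprise keeps
    # each unordered pair once and excludes an enterprise paired with itself.
    pairs = conn.execute(
        """
        SELECT a.enterprise, b.enterprise, a.shareholder
        FROM shareholding AS a
        JOIN shareholding AS b
          ON a.shareholder = b.shareholder AND a.enterprise < b.enterprise
        ORDER BY a.shareholder, a.enterprise, b.enterprise
        """
    ).fetchall()
    conn.close()
    return pairs
```

The same join pattern applies to the other relation types (region, fund account, and so on) by swapping the column that carries the association factor.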
The newly-built enterprise mining model comprises a first date calculation unit, a first data screening unit and a second data result storage unit;
the first date calculation unit is used for calculating the difference between each enterprise's registration date in the data lake and the current date to determine whether the enterprise was registered within a specified time period. The time period can be set in the parameter setting panel.
The first data screening unit is used for screening enterprises with registration time within a specified time period based on the calculated date difference.
The second data result storage unit is used for automatically storing the screened data in a data table, so that it can be retrieved or imported at any time and displayed in visual charts.
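The date calculation and screening steps above can be sketched in a few lines. The record layout and the parameter name `max_age_days` are assumptions standing in for the time period set in the parameter setting panel.

```python
from datetime import date


def newly_registered(enterprises: list[dict], today: date, max_age_days: int) -> list[str]:
    """Return names of enterprises registered within the last max_age_days days."""
    result = []
    for ent in enterprises:
        # Date difference between the current date and the registration date.
        age_days = (today - ent["registered_on"]).days
        # Negative ages (registration dates in the future) are treated as invalid.
        if 0 <= age_days <= max_age_days:
            result.append(ent["name"])
    return result
```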
The demand mining model comprises an element change monitoring unit, a second association element configuration unit, a second model control unit and a third data result storage unit;
the element change monitoring unit is used for applying real-time data stream processing technologies (Apache Kafka and Apache Flink) to monitor and analyze real-time changes of the data streams in the data lake. Elements related to demand analysis are monitored with priority, including enterprise address changes, branch openings, credit investigation changes, operation range changes, and recruitment demand changes.
As a further refinement, this embodiment uses Apache Kafka and Apache Flink as a stream-batch integrated processing engine to support both real-time and batch data processing. The stream processing layer captures, processes, and analyzes real-time data streams as they arrive, supporting low-latency real-time decisions. The batch layer performs large-scale offline data analysis, supporting complex data mining and report generation.
The second association element configuration unit is used for selecting the corresponding demand mining analysis model through the parameter setting panel of the analysis model and configuring association elements for it, such as the marketing business, branch setup time, branch region, operation range keywords, recruitment post names, and recruitment numbers.
The second model control unit is used for controlling whether the analysis model runs; when the analysis model runs, the SQL tool preset by the platform associates the marketing business demand elements with the enterprise association elements and finds the enterprises that match the marketing business demand elements. When no marketing business is selected, the analysis model identifies demands associated with new changes, which may relate to personnel, technology, resources, processes, markets, customers, and so on.
The third data result storage unit is used for automatically storing the analyzed data in a data table, so that it can be retrieved or imported at any time and displayed in visual charts.
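The idea behind the demand mining model, detecting changes in monitored elements and relating them to marketing business, may be illustrated as below. Plain Python dictionaries stand in for the Kafka/Flink stream and the SQL association; the snapshot fields, the `rules` mapping, and the service names used with it are hypothetical.

```python
def detect_changes(previous: dict, current: dict, watched_fields: set[str]) -> dict:
    """Return {field: (old, new)} for each watched field whose value changed."""
    return {
        name: (previous.get(name), current.get(name))
        for name in sorted(watched_fields)
        if previous.get(name) != current.get(name)
    }


def match_marketing_services(changes: dict, rules: dict[str, list[str]]) -> list[str]:
    """Map changed fields to candidate marketing services, without duplicates."""
    matched = []
    for field_name in changes:
        for service in rules.get(field_name, []):
            if service not in matched:
                matched.append(service)
    return matched
```

For example, an address change together with a newly opened branch would yield every service that the assumed `rules` table links to either field, each listed once.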
The service upgrading model comprises a second date calculating unit, a second data screening unit and a fourth data result storage unit;
the second date calculating unit is used for calculating the expiration time of the services that enterprises in the data lake have subscribed to.
The second data screening unit is used for screening, based on the calculated expiration time, the enterprises whose relevant service expires within a threshold; the service expiration parameter is set through the setting panel.
The fourth data result storage unit is used for automatically storing the screened data in a data table, so that it can be retrieved or imported at any time and displayed in visual charts.
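One possible reading of the service upgrading model is sketched below: the expiration date is derived from a start date and a contract term, and services expiring within a threshold are selected. The original does not say how expiration is calculated, so the `start` and `term_months` fields and the month-based term are assumptions.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Add whole months, clamping the day to the end of the target month."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def expiring_services(services: list[dict], today: date, threshold_days: int) -> list[tuple]:
    """Return (enterprise, service, expires_on) for services due within the threshold."""
    due = []
    for svc in services:
        expires_on = add_months(svc["start"], svc["term_months"])
        days_left = (expires_on - today).days
        # Already-expired services (negative days_left) are left out of this sketch.
        if 0 <= days_left <= threshold_days:
            due.append((svc["enterprise"], svc["service"], expires_on))
    return due
```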
Further, the model prediction module comprises an information base construction unit, a business opportunity mining unit and a prediction unit;
the information base construction unit is used for creating database tables from the bidding information in the imported enterprise data, then designing a data model and determining the relations between the tables to obtain a market bidding information base; the bidding information in the imported enterprise data includes the project name, the bidding party, the expiration date, the contract amount, and the like.
The business opportunity mining unit is used for building a market space big data mining model library based on the market bidding information base, combined with the client information of stock enterprises on the local network, and for mining the development space of government and enterprise business in the blue sea market, the stock market, and other-network markets;
the prediction unit is used for, after market space mining, visually displaying the screened enterprise data and business matching reports and exporting them to a file according to the user's needs.
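A minimal sketch of a market bidding information base is given below, using the fields listed above (project name, bidding party, expiration date, contract amount). SQLite is used only so that the example is self-contained; the table names, column names, and the choice of a separate `enterprise` table linked by a foreign key are assumptions about how the relation between tables might be modeled.

```python
import sqlite3

# Two related tables: each bid row references the enterprise that is the bidding party.
SCHEMA = """
CREATE TABLE enterprise (
    enterprise_id INTEGER PRIMARY KEY,
    name          TEXT NOT NULL UNIQUE
);
CREATE TABLE bid (
    bid_id          INTEGER PRIMARY KEY,
    project_name    TEXT NOT NULL,
    bidder_id       INTEGER NOT NULL REFERENCES enterprise(enterprise_id),
    expiration_date TEXT NOT NULL,   -- ISO 8601 date string
    contract_amount REAL
);
"""


def build_bid_base() -> sqlite3.Connection:
    """Create an in-memory bidding information base with the two tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # ask SQLite to enforce the relation
    conn.executescript(SCHEMA)
    return conn
```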
Further, the risk prediction module comprises an information change monitoring unit, a negative information detection unit, a negative information collection unit, a risk early warning unit and a visual output unit;
the information change monitoring unit is used for monitoring and analyzing the real-time change of the data flow in the data lake by applying a real-time data flow processing technology and obtaining change information of each dimension of an enterprise;
This embodiment uses Apache Kafka and Apache Flink as a stream-batch integrated processing engine to support both real-time and batch data processing. The stream processing layer captures, processes, and analyzes real-time data streams as they arrive, supporting low-latency real-time decisions. The batch layer performs large-scale offline data analysis, supporting complex data mining and report generation.
The negative information detection unit is used for combining several algorithms, namely keyword matching, topic modeling, word embedding, sentiment analysis, a rule engine, and supervised learning, and integrating the analysis results of the market space mining model;
the negative information collecting unit is used for automatically selecting a trained model, applying the trained model to the text data to be classified, and classifying the text data into different negative information categories;
the risk early warning unit is used for notifying the relevant personnel by email and SMS when the enterprise is at risk;
the visual output unit is used for visually displaying the risk analysis result and exporting a risk analysis report.
Example two
As shown in fig. 2, the method for reconnaissance of enterprise information provided by the invention comprises the following steps:
expanding a data source based on initial enterprise client data, introducing business cooperation data, collecting website data from credit investigation sites, vertical portals and enterprise news, and obtaining target enterprise data through data planning processing;
based on target enterprise data, integrating enterprise business opportunity information to construct a market space mining model; carrying out enterprise business condition analysis and business opportunity mining according to the market space mining model to obtain a prediction result;
and carrying out risk judgment and early warning according to the prediction result, obtaining the abnormal condition of enterprise operation, and outputting a visual abnormal result and an analysis report.
Further, the process of obtaining the target enterprise data through data planning processing is as follows.
Based on enterprise client data, Internet data sources are comprehensively expanded: business cooperation data is introduced, website data from credit investigation sites, vertical portals and enterprise news is collected at the same time, and a full-scale, unique and dynamic enterprise big data mart is formed through data planning processing. The steps are as follows:
Step 1: Data aggregation: the Sqoop component is used to import data from relational databases into Hadoop or Hive through Sqoop commands. The Oozie component is used to process the serial and parallel import tasks and to resolve the dependencies between them.
Step 2: Data cleaning and integration: when data is imported, the Hadoop, Hive and Spark components are used to perform consistency checks, invalid-value and missing-value handling, duplicate-data handling and data standardization on dirty data containing useless information.
Step 3: Data storage: Spark calls the interfaces of relational databases such as Oracle and MySQL and of the in-memory database Redis to store the processed structured data in the corresponding databases.
In a further refinement, data storage in this embodiment uses a private data center to establish a data lake and a data warehouse. The data lake stores raw, unprocessed data, including structured, semi-structured and unstructured data. The data warehouse stores cleaned, converted and consolidated data to support high-performance batch analysis.
Step 4: Data service interface design: RESTful design principles are used to define the API endpoints and the data interaction mode. The API data format is defined in JSON or XML, and the version number is included in the URL for API version control.
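The cleaning passes named in Step 2 can be illustrated with a minimal sketch. It uses plain Python rather than the Hadoop, Hive and Spark components of the embodiment, and the record fields (`name`, `amount`, `registered`), the date format and the function name `clean_records` are assumptions made only for this example.

```python
from datetime import date, datetime

def clean_records(records, today):
    """Apply the Step 2 cleaning passes to a list of dict records (hypothetical fields)."""
    cleaned = []
    seen_names = set()
    for rec in records:
        # Missing-value handling: a record without a name is unusable.
        name = (rec.get("name") or "").strip()
        if not name:
            continue
        # Invalid-value handling: the amount must parse as a non-negative number.
        try:
            amount = float(rec.get("amount"))
        except (TypeError, ValueError):
            continue
        if amount < 0:
            continue
        # Standardization: parse the assumed YYYY/MM/DD input into a date.
        try:
            registered = datetime.strptime(rec.get("registered"), "%Y/%m/%d").date()
        except (TypeError, ValueError):
            continue
        # Consistency check: a registration date cannot lie in the future.
        if registered > today:
            continue
        # Duplicate handling: keep the first record per case-insensitive name.
        key = name.lower()
        if key in seen_names:
            continue
        seen_names.add(key)
        cleaned.append({"name": name, "amount": amount,
                        "registered": registered.isoformat()})
    return cleaned

raw = [
    {"name": " Acme Ltd ", "amount": "1200.5", "registered": "2021/03/04"},
    {"name": "acme ltd", "amount": "1200.5", "registered": "2021/03/04"},  # duplicate
    {"name": "", "amount": "10", "registered": "2021/01/01"},              # missing name
    {"name": "Beta Co", "amount": "n/a", "registered": "2021/01/01"},      # invalid amount
    {"name": "Gamma Inc", "amount": 50, "registered": "2999/01/01"},       # inconsistent date
]
result = clean_records(raw, today=date(2024, 1, 1))
```

Only the first record survives here; in a Spark job the same passes would typically be expressed as DataFrame filters and a de-duplication step.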
Further, the enterprise business condition analysis and business opportunity mining are carried out according to the market space mining model, and the process of obtaining the prediction result comprises,
creating database tables according to bidding information of imported enterprise data, designing a data model, determining the relation between the tables, and obtaining a market bidding information base;
based on the market bidding information base, building a market space big data mining model library in combination with the stock (existing) enterprise customer information of the local network, and mining the development space for government and enterprise business in the blue sea market, the stock market and the other-network market;
after market space mining is carried out through the market space mining model, a user visually displays and exports the screened enterprise data and business matching report to a file according to the needs.
In a further refinement, enterprise business opportunity information is integrated on the basis of enterprise client data, and several types of market space mining models are constructed, so that enterprise business conditions can be analyzed and business opportunities mined. This comprises the following 3 steps:
Step 1: Build the market bidding information base.
Sub-step 1: Import the bidding information of the enterprise data, including project name, bidding party, expiration date, contract amount, and the like.
Sub-step 2: Create database tables to store the bidding information.
Sub-step 3: Design a data model and determine the relations between the tables.
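The three sub-steps above can be sketched with an in-memory SQLite database. The patent does not specify a schema, so the table names `enterprise` and `bidding`, the column names and the sample row are assumptions; the only relation modeled is that each bidding record references one bidding party.

```python
import sqlite3

# Hypothetical schema: one table of enterprises, one table of bidding records.
SCHEMA = """
CREATE TABLE enterprise (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE bidding (
    id              INTEGER PRIMARY KEY,
    project_name    TEXT NOT NULL,
    bidder_id       INTEGER NOT NULL REFERENCES enterprise(id),
    expiration_date TEXT NOT NULL,  -- ISO 8601 date string
    contract_amount REAL
);
"""

def build_bidding_db():
    """Create the tables and enable enforcement of the table relation."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn

conn = build_bidding_db()
conn.execute("INSERT INTO enterprise (id, name) VALUES (1, 'Acme Ltd')")
conn.execute(
    "INSERT INTO bidding (project_name, bidder_id, expiration_date, contract_amount) "
    "VALUES (?, ?, ?, ?)",
    ("Campus network upgrade", 1, "2024-06-30", 250000.0),
)
# Join the two tables through the bidder relation.
rows = conn.execute(
    "SELECT e.name, b.project_name, b.contract_amount "
    "FROM bidding b JOIN enterprise e ON e.id = b.bidder_id"
).fetchall()
```

With foreign keys enabled, inserting a bidding record whose `bidder_id` matches no enterprise is rejected, which is one simple way to keep the relation between the tables consistent.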
Step 2: Carry out business opportunity mining based on the market space mining models.
Based on the market bidding information base, a market space big data mining model library is built in combination with the stock (existing) enterprise customer information of the local network, and the development space for government and enterprise business in the blue sea market, the stock market and the other-network market is mined to empower government and enterprise business marketing.
1. Blue sea market space mining model
(1) Business bid analysis model:
Step 1: Features associated with a specified business are extracted, such as project name, project description, release date, expiration date, bidding organization, location, etc. During extraction, Natural Language Processing (NLP) techniques such as word segmentation, word frequency statistics and sentiment analysis are applied to the text data so as to extract more information.
Step 2: Service type screening: in the analysis parameter setting panel, the service type or project name keywords to be extracted are set.
Step 3: Bidding status screening: the bid issue time and the bid expiration time are set in the analysis parameter setting panel.
Step 4: Other feature screening: features such as region, amount, industry and enterprise scale can optionally be set in the analysis parameter setting panel.
Step 5: Interference factor rejection: the items to be rejected can optionally be set in the analysis parameter setting panel, including region, amount, industry, enterprise scale and project keywords.
Step 6: According to the specific parameter settings, the automated algorithm preset by the platform is started, which uses a text classification model and a time series model to compute the result.
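The screening in Steps 2 to 5 can be sketched as a parameter object applied to bidding records. This is only an illustration under assumed field names (`project_name`, `issue_date`, `expiration_date`, `region`); it uses simple keyword matching and does not reproduce the text classification or time series models of Step 6.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class BidFilter:
    """Hypothetical stand-in for the analysis parameter setting panel."""
    keywords: list                 # Step 2: service type or project name keywords
    issued_from: date              # Step 3: earliest bid issue date
    expires_by: date               # Step 3: latest bid expiration date
    regions: set = field(default_factory=set)             # Step 4: empty means any region
    exclude_keywords: list = field(default_factory=list)  # Step 5: rejected project keywords

def screen_bids(bids, f):
    """Return the bids that pass every configured screening step."""
    selected = []
    for bid in bids:
        text = bid["project_name"].lower()
        if not any(k.lower() in text for k in f.keywords):
            continue
        if bid["issue_date"] < f.issued_from or bid["expiration_date"] > f.expires_by:
            continue
        if f.regions and bid["region"] not in f.regions:
            continue
        if any(k.lower() in text for k in f.exclude_keywords):
            continue
        selected.append(bid)
    return selected

bids = [
    {"project_name": "Campus Network Upgrade", "issue_date": date(2024, 1, 10),
     "expiration_date": date(2024, 2, 20), "region": "Hangzhou"},
    {"project_name": "Network Cabling Trial", "issue_date": date(2024, 1, 12),
     "expiration_date": date(2024, 2, 25), "region": "Hangzhou"},
    {"project_name": "Office Furniture Supply", "issue_date": date(2024, 1, 15),
     "expiration_date": date(2024, 2, 15), "region": "Hangzhou"},
    {"project_name": "Cloud Network Service", "issue_date": date(2023, 11, 1),
     "expiration_date": date(2024, 2, 1), "region": "Hangzhou"},
]
params = BidFilter(keywords=["network"], issued_from=date(2024, 1, 1),
                   expires_by=date(2024, 3, 1), regions={"Hangzhou"},
                   exclude_keywords=["trial"])
matches = screen_bids(bids, params)
```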
(2) Associated enterprise model:
Step 1: Define the association relationships: the association relationships among enterprises are defined according to the analysis application requirements, and include stakeholder association relationships, business cooperation relationships, fund account relationships, external investment relationships and regional relationships.
Step 2: Configure the association elements: the corresponding association relationship analysis model is selected through the parameter setting panel of the analysis model, and association elements such as stakeholder name, business name and region are configured.
Step 3: Run the analysis model: an SQL tool preset by the platform is used to associate enterprises with the various relationship analysis elements and to find enterprise pairs and enterprise groups that share the same association factors.
Step 4: Store the data result: the analyzed data is automatically stored in a data table so that it can be retrieved, imported and displayed in visual charts at any time.
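Step 3 describes an SQL association that finds enterprises sharing the same factor. A self-join is one common way to express this; the sketch below assumes a single hypothetical table `relation(enterprise, kind, value)` and sample rows, none of which come from the patent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE relation (enterprise TEXT, kind TEXT, value TEXT)")
conn.executemany("INSERT INTO relation VALUES (?, ?, ?)", [
    ("Acme Ltd", "stakeholder", "Zhang Wei"),
    ("Beta Co", "stakeholder", "Zhang Wei"),
    ("Beta Co", "region", "Hangzhou"),
    ("Gamma Inc", "region", "Hangzhou"),
    ("Delta LLC", "stakeholder", "Li Na"),
])

# Self-join: two different enterprises that share the same value for the same
# relation kind. The "<" comparison reports each pair once and drops self-pairs.
PAIR_SQL = """
SELECT a.enterprise, b.enterprise, a.kind, a.value
FROM relation a
JOIN relation b
  ON a.kind = b.kind AND a.value = b.value AND a.enterprise < b.enterprise
WHERE a.kind = ?
ORDER BY a.enterprise, b.enterprise
"""

def find_pairs(conn, kind):
    """Return enterprise pairs associated through the given relation kind."""
    return conn.execute(PAIR_SQL, (kind,)).fetchall()

stakeholder_pairs = find_pairs(conn, "stakeholder")
region_pairs = find_pairs(conn, "region")
```

Enterprise groups, as opposed to pairs, could be obtained from the same table by grouping on `kind` and `value` and keeping groups with more than one member.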
(3) Newly established enterprise mining model:
Step 1: Date calculation: the difference between the registration date of each enterprise in the data lake and the current date is calculated to determine whether the enterprise was registered within the specified time period. The time period can be set in the parameter setting panel.
Step 2: Data screening: based on the calculated date difference, the enterprises whose registration time falls within the specified time period are screened out.
Step 3: Store the data result: the screened data is automatically stored in a data table so that it can be retrieved, imported and displayed in visual charts at any time.
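The date calculation and screening of this model amount to a date difference compared against a configurable window. A minimal sketch, assuming each enterprise record carries a `registered` date and that the window is expressed in days:

```python
from datetime import date

def newly_registered(enterprises, window_days, today):
    """Return names of enterprises registered within the last window_days days."""
    result = []
    for ent in enterprises:
        # Step 1: difference between the current date and the registration date.
        age_days = (today - ent["registered"]).days
        # Step 2: keep enterprises inside the window; negative ages are bad data.
        if 0 <= age_days <= window_days:
            result.append(ent["name"])
    return result

enterprises = [
    {"name": "Acme Ltd", "registered": date(2024, 1, 10)},   # 21 days old
    {"name": "Beta Co", "registered": date(2024, 1, 1)},     # 30 days old
    {"name": "Gamma Inc", "registered": date(2023, 11, 1)},  # 91 days old
    {"name": "Delta LLC", "registered": date(2024, 2, 5)},   # dated in the future
]
recent = newly_registered(enterprises, window_days=30, today=date(2024, 1, 31))
```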
2. Stock market space mining model
(1) Demand mining model (address change, branch opening, business demand change, recruitment increase)
Step 1: Monitor changes in demand analysis elements: real-time changes of the data streams in the data lake are monitored and analyzed using real-time data stream processing technology (Apache Kafka and Apache Flink). Elements related to demand analysis are monitored with priority, including enterprise address changes, branch openings, credit investigation changes, business scope changes, recruitment demand changes and the like.
Step 2: Configure association elements for the analysis model: the corresponding demand mining analysis model is selected through the parameter setting panel of the analysis model, and association elements such as marketing business, branch establishment time, branch region, business scope keywords, recruitment post name and recruitment headcount are configured for it.
Step 3: Run the analysis model: an SQL tool preset by the platform is used to associate the marketing business demand elements with the enterprise association elements and to find the enterprises that match the marketing business demand elements. When no marketing business is selected, the analysis model identifies demands associated with new changes, which may relate to personnel, technology, resources, processes, markets, customers, etc.
Step 4: Store the data result: the analyzed data is automatically stored in a data table so that it can be retrieved, imported and displayed in visual charts at any time.
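The change monitoring in Step 1 is performed in the embodiment on Kafka and Flink data streams. As a simplified stand-in that does not use those frameworks, the sketch below compares two snapshots of enterprise records and reports which watched fields changed. The field names and the snapshot layout are assumptions.

```python
# Hypothetical demand-related fields to watch for each enterprise.
WATCHED_FIELDS = ("address", "branch_count", "business_scope", "open_positions")

def detect_changes(old, new):
    """Compare two {enterprise_id: record} snapshots and list field changes."""
    changes = []
    for ent_id, current in new.items():
        previous = old.get(ent_id)
        if previous is None:
            continue  # enterprises absent from the old snapshot are not compared
        for name in WATCHED_FIELDS:
            if previous.get(name) != current.get(name):
                changes.append((ent_id, name, previous.get(name), current.get(name)))
    return changes

old_snapshot = {
    "E1": {"address": "1 West Rd", "branch_count": 2,
           "business_scope": "retail", "open_positions": 3},
    "E2": {"address": "9 Lake St", "branch_count": 1,
           "business_scope": "logistics", "open_positions": 0},
}
new_snapshot = {
    "E1": {"address": "88 East Ave", "branch_count": 3,
           "business_scope": "retail", "open_positions": 3},
    "E2": {"address": "9 Lake St", "branch_count": 1,
           "business_scope": "logistics", "open_positions": 0},
}
events = detect_changes(old_snapshot, new_snapshot)
```

Each reported change (here an address change and a new branch for `E1`) could then be matched against the association elements configured in Step 2.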
(2) Service upgrade model:
Step 1: Date calculation: the expiration time of the services handled by the enterprises in the data lake is calculated.
Step 2: Data screening: a service expiration parameter is set through the setting panel and, based on the calculated expiration time, the enterprises whose related services expire within the configured threshold are screened out.
Step 3: Store the data result: the screened data is automatically stored in a data table so that it can be retrieved, imported and displayed in visual charts at any time.
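A minimal sketch of the expiration calculation and screening, assuming each service record stores a `start` date and a contract term in days (`term_days`); the patent does not say how the expiration time is derived, so this derivation is an assumption.

```python
from datetime import date, timedelta

def expiring_services(services, within_days, today):
    """Return (enterprise, service, expiration date) for services expiring soon."""
    deadline = today + timedelta(days=within_days)
    hits = []
    for svc in services:
        # Step 1: expiration time derived from the start date and the term.
        expires = svc["start"] + timedelta(days=svc["term_days"])
        # Step 2: keep services expiring between today and the deadline.
        if today <= expires <= deadline:
            hits.append((svc["enterprise"], svc["service"], expires))
    return hits

services = [
    {"enterprise": "Acme Ltd", "service": "broadband",
     "start": date(2023, 1, 15), "term_days": 365},
    {"enterprise": "Beta Co", "service": "cloud hosting",
     "start": date(2023, 6, 1), "term_days": 365},
    {"enterprise": "Gamma Inc", "service": "leased line",
     "start": date(2022, 12, 1), "term_days": 365},  # already expired
]
upgrade_candidates = expiring_services(services, within_days=30, today=date(2024, 1, 1))
```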
Step 3: Output the target enterprise list and the business matching report.
After market space mining is carried out through the mining models, a user can select the screened enterprise data and service matching reports to carry out visual display and export to a file according to the needs.
Further, the risk judging and early warning process according to the prediction result comprises the steps of,
The platform performs risk judgment and early warning for an enterprise's operating abnormalities, judicial risk, telecom risk and reputation risk by comprehensively analyzing information such as basic enterprise information, business operation data, violation records, social security participation data, tax arrears data and public opinion data, so that users learn of a target enterprise's operating abnormalities at the earliest opportunity.
Step 1: Monitor changes in enterprise information: real-time changes of the data streams in the data lake are monitored and analyzed using real-time data stream processing technology (Apache Kafka and Apache Flink) to obtain change information for each dimension of the enterprise.
In a further refinement, this embodiment uses Apache Kafka and Apache Flink as a unified stream and batch processing engine to support both real-time and batch data processing. At the stream processing layer, real-time data streams are captured, processed and analyzed in real time to support low-latency decisions. At the batch layer, large-scale offline data analysis supports complex data mining and report generation.
Step 2: Detection of negative information: the platform combines several different algorithms, such as keyword matching, topic modeling, word embedding, sentiment analysis, rule engines and supervised learning, and integrates the analysis results of all models to improve the accuracy of negative information identification.
Step 3: collection of negative information: the platform automatically selects the trained model to apply to the text data to be classified to categorize it into different negative information categories. The model assigns one or more category labels to each text sample, indicating the category to which it belongs.
Step 4: risk early warning: when the enterprise is at risk, the platform timely informs related personnel through the E-mail and the short message.
Step 5: Visualization or data output: as required, the risk analysis result can be displayed visually and a risk analysis report can be exported.
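Step 2 combines the outputs of several detectors. One simple way to integrate such results is majority voting; the sketch below illustrates it with three toy detectors (keyword matching, a regular-expression rule engine and a word-count sentiment heuristic). The word lists, rules and vote threshold are invented for illustration, and the topic modeling, word embedding and supervised learning components are omitted.

```python
import re

# Hypothetical resources; a real deployment would use curated lists and trained models.
NEGATIVE_KEYWORDS = {"lawsuit", "default", "penalty", "bankruptcy"}
RULES = [
    re.compile(r"tax(es)?\s+(arrears|owed)", re.IGNORECASE),
    re.compile(r"licen[cs]e\s+revoked", re.IGNORECASE),
]
NEGATIVE_WORDS = {"fraud", "loss", "lawsuit", "penalty", "revoked"}
POSITIVE_WORDS = {"award", "excellent", "growth"}

def tokens(text):
    return re.findall(r"[a-z]+", text.lower())

def keyword_vote(text):
    """Keyword matching: any listed keyword appears in the text."""
    return any(t in NEGATIVE_KEYWORDS for t in tokens(text))

def rule_vote(text):
    """Rule engine: any regular-expression rule matches."""
    return any(rule.search(text) for rule in RULES)

def sentiment_vote(text):
    """Crude sentiment: more negative than positive lexicon words."""
    words = tokens(text)
    negative = sum(w in NEGATIVE_WORDS for w in words)
    positive = sum(w in POSITIVE_WORDS for w in words)
    return negative > positive

def is_negative(text, min_votes=2):
    """Integrate the detectors: flag the text when enough of them agree."""
    votes = sum([keyword_vote(text), rule_vote(text), sentiment_vote(text)])
    return votes >= min_votes

flagged = is_negative("Regulator imposed a penalty and the licence revoked notice followed a lawsuit")
praised = is_negative("The company won an award for excellent growth")
mixed = is_negative("Minor penalty reported amid strong growth and an excellent award")
```

Requiring agreement between detectors trades some recall for precision: the third text triggers only the keyword detector and is therefore not flagged.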
The invention builds the enterprise big data mart around the two application directions of business condition analysis and risk management, so that the reconnaissance platform provides application capabilities such as business condition analysis, market space mining, enterprise business opportunity mining and enterprise risk monitoring.
The foregoing is merely a preferred embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions easily conceivable by those skilled in the art within the technical scope of the present application should be covered in the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. An enterprise information reconnaissance platform, comprising:
the data acquisition module is used for expanding a data source based on initial enterprise client data, introducing business cooperation data and simultaneously collecting website data of credit investigation, vertical portals and enterprise news;
the data processing module is connected with the data acquisition module and is used for carrying out data planning processing on the expansion data to obtain target enterprise data;
the model construction module is connected with the data processing module and used for fusing enterprise business opportunity information based on the target enterprise data to construct a market space mining model;
the model prediction module is connected with the model construction module and is used for carrying out enterprise business condition analysis and business opportunity mining according to the market space mining model to obtain a prediction result;
the risk prediction module is connected with the model prediction module and used for carrying out risk judgment and early warning according to the prediction result, obtaining abnormal conditions of enterprise operation and outputting a visual risk analysis result and an analysis report.
2. The enterprise information reconnaissance platform of claim 1,
the data processing module comprises a data aggregation unit, a data cleaning and integrating unit, a data storage unit and an interface design unit;
the data aggregation unit is used for importing data from a relational database into Hadoop or Hive through Sqoop commands by utilizing the Sqoop component, and is also used for processing the serial and parallel tasks of data import by adopting the Oozie component;
the data cleaning and integrating unit is used for, when data is imported, adopting the Hadoop component, the Hive component and the Spark component to carry out consistency checks, invalid value and missing value processing, repeated data processing and data standardization processing on dirty data containing useless information;
the data storage unit is used for storing the processed structured data into the corresponding databases through Spark calling the interfaces of the Oracle and MySQL relational databases and of the in-memory database Redis;
the interface design unit is used for defining the endpoint and the data interaction mode of the API by using RESTful design principle and defining the data format of the API by using JSON or XML.
3. The enterprise information reconnaissance platform of claim 2,
the data storage unit comprises a data lake and a data warehouse;
the data lake is used for storing original and unprocessed data, including structured, semi-structured and unstructured data;
the data warehouse is used for storing data subjected to cleaning, conversion and arrangement so as to support high-performance batch analysis.
4. The enterprise information reconnaissance platform of claim 1,
the market space mining model comprises a blue sea market space mining model and a stock market space mining model;
the blue sea market space mining model comprises a business bid analysis model, an associated enterprise model and a newly established enterprise mining model;
the stock market space mining model comprises a demand mining model and a service upgrading model.
5. The enterprise information reconnaissance platform of claim 1,
the model prediction module comprises an information base construction unit, a business opportunity mining unit and a prediction unit;
the information base construction unit is used for creating database tables according to bidding information of imported enterprise data, then designing a data model, determining the relation between the tables and obtaining a market bidding information base;
the business opportunity mining unit is used for building a market space big data mining model library based on the market bidding information base, in combination with the stock (existing) enterprise customer information of the local network, and mining the development space for government and enterprise business in the blue sea market, the stock market and the other-network market;
the prediction unit is used for, after market space mining, visually displaying the screened enterprise data and business matching reports and exporting them to a file according to the needs of a user.
6. The enterprise information reconnaissance platform of claim 1,
the risk prediction module comprises an information change monitoring unit, a negative information detection unit, a negative information collection unit, a risk early warning unit and a visual output unit;
the information change monitoring unit is used for monitoring and analyzing the real-time change of the data flow in the data lake by applying a real-time data flow processing technology and obtaining change information of each dimension of an enterprise;
the negative information detection unit is used for combining several different algorithms, namely keyword matching, topic modeling, word embedding, sentiment analysis, a rule engine and supervised learning, to integrate analysis results of the market space mining model;
the negative information collection unit is used for automatically selecting a trained model, applying the trained model to text data to be classified, and classifying the text data into different negative information categories;
the risk early warning unit is used for notifying related personnel through an email and a short message when the enterprise has risks;
the visual output unit is used for visually displaying the risk analysis result and exporting a risk analysis report.
7. An enterprise information reconnaissance method, characterized by comprising the following steps:
expanding a data source based on initial enterprise client data, introducing business cooperation data, collecting website data from credit investigation sites, vertical portals and enterprise news, and obtaining target enterprise data through data planning processing;
based on the target enterprise data, fusing enterprise business opportunity information and constructing a market space mining model; carrying out enterprise business condition analysis and business opportunity mining according to the market space mining model to obtain a prediction result;
and carrying out risk judgment and early warning according to the prediction result, obtaining the abnormal condition of enterprise operation, and outputting a visual abnormal result and an analysis report.
8. The method of claim 7, wherein,
the process of obtaining target enterprise data through the data planning process includes,
the SQOOP component is utilized to import the data of the relational database into the HADOOP or HIVE through the SQOOP command, and the OOZIE component is utilized to process serial and parallel tasks of data import;
when data is imported, a Hadoop component, a Hive component and a SPARK component are adopted to carry out consistency check, invalid value and missing value processing, repeated data processing and data standardization processing on dirty data containing useless information;
the processed structured data is stored into the corresponding databases through Spark calling the interfaces of relational databases such as Oracle and MySQL and of the in-memory database Redis;
the data service interface is designed, the RESTful design principle is used for defining the end points and the data interaction modes of the API, and the JSON or XML is used for defining the data format of the API.
9. The method of claim 7, wherein,
carrying out enterprise business condition analysis and business opportunity mining according to the market space mining model, and obtaining the prediction result comprises the following steps of,
creating database tables according to bidding information of imported enterprise data, designing a data model, determining the relation between the tables, and obtaining a market bidding information base;
based on the market bidding information base, building a market space big data mining model base by combining the client information of the local network stock enterprises, and mining the development space of blue sea market, stock market and other network market government enterprise services;
after market space mining is carried out through the market space mining model, a user visually displays and exports the screened enterprise data and business matching report to a file according to the needs.
10. The method of claim 7, wherein,
the risk judging and early warning process according to the prediction result comprises,
applying a real-time data stream processing technology to monitor and analyze real-time changes of data streams in a data lake and obtaining change information of each dimension of an enterprise;
combining several different algorithms, namely keyword matching, topic modeling, word embedding, sentiment analysis, a rule engine and supervised learning, and integrating analysis results of the market space mining model;
automatically selecting a trained model, applying the trained model to text data to be classified, and classifying the text data into different negative information categories;
when the enterprise has risks, notifying related personnel through emails and short messages;
and visually displaying the risk analysis result as required, and exporting a risk analysis report.
CN202311322724.3A 2023-10-12 2023-10-12 Enterprise information reconnaissance platform and method Pending CN117391440A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311322724.3A CN117391440A (en) 2023-10-12 2023-10-12 Enterprise information reconnaissance platform and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311322724.3A CN117391440A (en) 2023-10-12 2023-10-12 Enterprise information reconnaissance platform and method

Publications (1)

Publication Number Publication Date
CN117391440A true CN117391440A (en) 2024-01-12

Family

ID=89469473

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311322724.3A Pending CN117391440A (en) 2023-10-12 2023-10-12 Enterprise information reconnaissance platform and method

Country Status (1)

Country Link
CN (1) CN117391440A (en)

Similar Documents

Publication Publication Date Title
Wang et al. Industrial big data analytics: challenges, methodologies, and applications
CN108572967B (en) Method and device for creating enterprise portrait
CN109597936B (en) New user screening system and method
US20140012800A1 (en) Apparatus and method for providing application for processing big data
CN112181960B (en) Intelligent operation and maintenance framework system based on AIOps
CN115423289B (en) Intelligent plate processing workshop data processing method and terminal
CN110851667A (en) Integrated analysis method and tool for multi-source large data
CN114880405A (en) Data lake-based data processing method and system
CN111062600A (en) Model evaluation method, system, electronic device, and computer-readable storage medium
Islam et al. A framework for effective big data analytics for decision support systems
CN117391440A (en) Enterprise information reconnaissance platform and method
Kohli et al. Big Data Analytics: An Overview
CN114116667A (en) Data management system for power data application scene
CN113886465A (en) Big data analysis platform for automobile logistics
CN114723548A (en) Data processing method, apparatus, device, medium, and program product
Yang Multivariate statistical methods and Six-Sigma
Grambau et al. Reference Architecture framework for enhanced social media data analytics for Predictive Maintenance models
Stubarev et al. Development of the analytical platform for CRM-system
CN110689241A (en) Power grid physical asset evaluation system based on big data
CN111612302A (en) Group-level data management method and equipment
Bai The application of customer relationship management and data mining in Chinese insurance companies
Han et al. Logistics Supply Chain Management Mode of Chinese E-Commerce Enterprises under the Background of Big Data and Internet of Things
Fu et al. Management of Power Marketing Audit Work Based on Tobit Model and Big Data Technology
US20230015637A1 (en) Method and System for Analyzing Data in a Database
CN113934769A (en) Intelligent data analysis method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination