CN117391440A - Enterprise information reconnaissance platform and method - Google Patents
- Publication number
- CN117391440A (application number CN202311322724.3A)
- Authority
- CN
- China
- Prior art keywords
- data
- enterprise
- model
- information
- unit
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0635—Risk analysis of enterprise or organisation activities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0639—Performance analysis of employees; Performance analysis of enterprise or organisation operations
- G06Q10/06395—Quality analysis or management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
- G06Q30/0203—Market surveys; Market polls
Abstract
The invention discloses an enterprise information reconnaissance platform and method. The method comprises: expanding the data sources based on initial enterprise client data, introducing business cooperation data, collecting website data from credit investigation sites, vertical portals and enterprise news, and obtaining target enterprise data through data planning processing; based on the target enterprise data, integrating enterprise business opportunity information to construct a market space mining model; carrying out enterprise business condition analysis and business opportunity mining according to the market space mining model to obtain a prediction result; and carrying out risk judgment and early warning according to the prediction result, obtaining abnormal conditions of enterprise operation, and outputting a visualized abnormality result and an analysis report. The reconnaissance platform is built on lake-warehouse integrated technology and stream-batch integrated technology. It better meets the analysis requirements of enterprise data, provides comprehensive data insight, supports real-time decision making and data-driven business development, and offers flexibility, comprehensiveness and performance.
Description
Technical Field
The invention belongs to the field of information reconnaissance, and particularly relates to an enterprise information reconnaissance platform and method.
Background
The enterprise information reconnaissance platform is a type of big data platform. Big data platforms currently on the market apply a wide and varied range of technologies, but they suffer to varying degrees from the following shortcomings in data security, real-time data integration, data processing and analysis performance, and algorithms and models:
1. Data security problem: the platform may store sensitive information, which may be leaked if it is improperly protected. In addition, improperly configured data access rights may allow unauthorized persons to access sensitive data.
2. Real-time data integration problem: real-time data stream integration and batch data integration have different performance requirements. Real-time stream processing requires low latency and high throughput, while batch processing focuses on efficient large-scale data analysis. Most current platforms cannot balance the two requirements: the stream and batch paths become disconnected, so the needs of real-time applications are not well met and batch data cannot be processed efficiently.
3. Data processing and analysis problems: the platform requires a large amount of computing resources, which may make analysis slow. At the same time, integrating data from different sources into one platform can be very complex: differences in data formats, protocols and the like must be resolved, and different types of data, including structured, semi-structured and unstructured data, must be supported, all of which require complex processing methods.
4. Algorithm and model problems: most data platforms require users or developers to write suitable model scripts by hand, which demands extensive domain knowledge and experience from the operator and is a significant challenge for non-professional data analysts. Because the platform does not provide automated data analysis models, performance and efficiency are poor, and the use of an unsuitable model leads to inaccurate analysis results.
Disclosure of Invention
To solve the above problems, the invention provides an enterprise information reconnaissance platform and method. The enterprise information reconnaissance platform includes:
the data acquisition module is used for expanding a data source based on initial enterprise client data, introducing business cooperation data and simultaneously collecting website data of credit investigation, vertical portals and enterprise news;
the data processing module is connected with the data acquisition module and is used for carrying out data planning processing on the expanded data to obtain target enterprise data;
the model construction module is connected with the data processing module and used for fusing enterprise business opportunity information based on the target enterprise data to construct a market space mining model;
the model prediction module is connected with the model construction module and is used for carrying out enterprise business condition analysis and business opportunity mining according to the market space mining model to obtain a prediction result;
the risk prediction module is connected with the model prediction module and used for carrying out risk judgment and early warning according to the prediction result, obtaining abnormal conditions of enterprise operation and outputting a visual risk analysis result and an analysis report.
Preferably, the data processing module comprises a data aggregation unit, a data cleaning and integrating unit, a data storage unit and an interface design unit;
the data aggregation unit is used for importing data from relational databases into Hadoop or Hive by means of Sqoop commands using the Sqoop component, and for handling the serial and parallel data import tasks using the Oozie component;
the data cleaning and integrating unit is used for, when data is imported, applying the Hadoop, Hive and Spark components to perform consistency checks, invalid-value and missing-value handling, duplicate-data handling and data standardization on dirty data containing useless information;
the data storage unit is used for storing the processed structured data into the corresponding databases by having Spark call the interfaces of relational databases such as Oracle and MySQL and of the in-memory database Redis;
the interface design unit is used for defining the endpoints and data interaction modes of the API according to RESTful design principles, and for defining the data format of the API in JSON or XML.
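As an illustration of the import step handled by the data aggregation unit, the following minimal sketch assembles a Sqoop import command that loads one relational table into Hive. The JDBC URL, user name, password file path and table names are hypothetical placeholders, not values from the patent; the flags shown are standard `sqoop import` options. The command is only built and printed here, since running it requires a Hadoop cluster.

```python
from typing import List


def build_sqoop_import(jdbc_url: str, username: str, password_file: str,
                       table: str, hive_table: str, mappers: int = 4) -> List[str]:
    """Return the argument vector for a Sqoop import of one table into Hive."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,             # source relational database
        "--username", username,
        "--password-file", password_file,  # avoids a password on the command line
        "--table", table,                  # source table to import
        "--hive-import",                   # load the imported data into Hive
        "--hive-table", hive_table,        # target Hive table
        "--num-mappers", str(mappers),     # number of parallel import tasks
    ]


# All values below are hypothetical and for illustration only.
cmd = build_sqoop_import(
    jdbc_url="jdbc:mysql://db.example.internal:3306/crm",
    username="etl_user",
    password_file="/user/etl/.db_password",
    table="enterprise_customer",
    hive_table="ods.enterprise_customer",
)
print(" ".join(cmd))
```

In a deployment of the kind described, a command like this would typically be one action in an Oozie workflow, which decides which imports run serially and which run in parallel.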
Preferably, the data storage unit comprises a data lake and a data warehouse;
the data lake is used for storing original and unprocessed data, including structured, semi-structured and unstructured data;
the data warehouse is used for storing data subjected to cleaning, conversion and arrangement so as to support high-performance batch analysis.
Preferably, the market space mining model comprises a blue ocean market space mining model and a stock (existing-customer) market space mining model;
the blue ocean market space mining model comprises a business bid analysis model, a gateway enterprise model and a newly-built enterprise mining model;
the stock market space mining model comprises a demand mining model and a service upgrading model.
Preferably, the model prediction module comprises an information base construction unit, a business opportunity mining unit and a prediction unit;
the information base construction unit is used for creating database tables from the bidding information in the imported enterprise data, then designing a data model and determining the relations between the tables to obtain a market bidding information base;
the business opportunity mining unit is used for building a market space big data mining model library based on the market bidding information base in combination with the client information of existing (stock) enterprises on the local network, and for mining the development space of government and enterprise business in the blue ocean market, the stock market and other-network markets;
the prediction unit is used for, after market space mining, visually displaying the screened enterprise data and business matching reports and exporting them to a file according to the needs of the user.
Preferably, the risk prediction module comprises an information change monitoring unit, a negative information detection unit, a negative information collection unit, a risk early warning unit and a visual output unit;
the information change monitoring unit is used for monitoring and analyzing the real-time change of the data flow in the data lake by applying a real-time data flow processing technology and obtaining change information of each dimension of an enterprise;
the negative information detection unit is used for combining several algorithms, namely keyword matching, topic modeling, word embedding, sentiment analysis, a rule engine and supervised learning, and for integrating the analysis results of the market space mining model;
the negative information collecting unit is used for automatically selecting a trained model, applying the trained model to text data to be classified, and classifying the text data into different negative information categories;
the risk early warning unit is used for notifying related personnel through an email and a short message when the enterprise has risks;
the visual output unit is used for visually displaying the risk analysis result and exporting a risk analysis report.
The invention also provides an enterprise information reconnaissance method, which comprises the following steps:
expanding a data source based on initial enterprise client data, introducing business cooperation data, collecting website data of credit investigation, vertical portals and enterprise news, and obtaining target enterprise data through data planning processing;
based on the target enterprise data, fusing enterprise business opportunity information and constructing a market space mining model; carrying out enterprise business condition analysis and business opportunity mining according to the market space mining model to obtain a prediction result;
and carrying out risk judgment and early warning according to the prediction result, obtaining the abnormal condition of enterprise operation, and outputting a visual abnormal result and an analysis report.
Preferably, the process of obtaining the target enterprise data through the data planning processing includes:
importing data from relational databases into Hadoop or Hive by means of Sqoop commands using the Sqoop component, and handling the serial and parallel data import tasks using the Oozie component;
when data is imported, applying the Hadoop, Hive and Spark components to perform consistency checks, invalid-value and missing-value handling, duplicate-data handling and data standardization on dirty data containing useless information;
storing the processed structured data into the corresponding databases by having Spark call the interfaces of relational databases such as Oracle and MySQL and of the in-memory database Redis;
designing the data service interface, defining the endpoints and data interaction modes of the API according to RESTful design principles, and defining the data format of the API in JSON or XML.
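The cleaning operations named in this process (consistency checks, invalid and missing value handling, duplicate removal and standardization) can be sketched in plain Python as follows. This is a minimal single-machine illustration, not the Hadoop, Hive and Spark implementation the text refers to; the field names (`enterprise_id`, `name`, `region`, `registered_capital`) and the specific rules are assumptions made for the example.

```python
from typing import Dict, Iterable, List, Optional

Record = Dict[str, Optional[str]]
REQUIRED_FIELDS = ("enterprise_id", "name")  # hypothetical mandatory fields


def clean_records(rows: Iterable[Record]) -> List[Record]:
    """Apply standardization, missing-value, consistency and duplicate handling."""
    seen_ids = set()
    cleaned: List[Record] = []
    for row in rows:
        # Standardization: trim whitespace and treat empty strings as missing.
        norm: Record = {}
        for key, value in row.items():
            value = value.strip() if isinstance(value, str) else value
            norm[key] = value if value else None
        # Missing values: drop rows that lack a mandatory field.
        if any(norm.get(field) is None for field in REQUIRED_FIELDS):
            continue
        # Consistency check: registered capital, if present, must be a
        # non-negative number.
        capital = norm.get("registered_capital")
        if capital is not None:
            try:
                if float(capital) < 0:
                    continue
            except ValueError:
                continue
        # Missing optional value: fill the region with a default, upper-cased.
        norm["region"] = (norm.get("region") or "UNKNOWN").upper()
        # Duplicate handling: keep the first record for each enterprise_id.
        if norm["enterprise_id"] in seen_ids:
            continue
        seen_ids.add(norm["enterprise_id"])
        cleaned.append(norm)
    return cleaned


raw = [
    {"enterprise_id": "E1", "name": " Acme Ltd ", "region": "gz", "registered_capital": "500"},
    {"enterprise_id": "E1", "name": "Acme Ltd", "region": "GZ", "registered_capital": "500"},
    {"enterprise_id": "E2", "name": "", "region": "sz", "registered_capital": "100"},
    {"enterprise_id": "E3", "name": "Beta Co", "region": None, "registered_capital": "-5"},
    {"enterprise_id": "E4", "name": "Gamma Inc", "region": "", "registered_capital": None},
]
result = clean_records(raw)
print([r["enterprise_id"] for r in result])
```

On the sample data, the duplicate E1 row, the E2 row with a missing name and the E3 row with a negative capital are dropped, leaving E1 and E4.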
Preferably, the process of carrying out enterprise business condition analysis and business opportunity mining according to the market space mining model to obtain the prediction result includes:
creating database tables from the bidding information in the imported enterprise data, designing a data model and determining the relations between the tables to obtain a market bidding information base;
based on the market bidding information base, building a market space big data mining model library in combination with the client information of existing (stock) enterprises on the local network, and mining the development space of government and enterprise business in the blue ocean market, the stock market and other-network markets;
after market space mining is carried out through the market space mining model, visually displaying the screened enterprise data and business matching reports and exporting them to a file according to the needs of the user.
Preferably, the process of carrying out risk judgment and early warning according to the prediction result includes:
applying a real-time data stream processing technology to monitor and analyze real-time changes of the data streams in the data lake and obtain change information for each dimension of an enterprise;
combining several algorithms, namely keyword matching, topic modeling, word embedding, sentiment analysis, a rule engine and supervised learning, and integrating the analysis results of the market space mining model;
automatically selecting a trained model, applying the trained model to the text data to be classified, and classifying the text data into different negative information categories;
when an enterprise is at risk, notifying the relevant personnel by email and short message;
and visually displaying the risk analysis result as required, and exporting a risk analysis report.
Compared with the prior art, the invention has the following advantages and technical effects:
1. Based on the lake-warehouse integrated (lakehouse) technology, the data warehouse and the data lake are connected. The data lake stores original, unprocessed data, including structured, semi-structured and unstructured data. The data warehouse stores cleaned, converted and consolidated data to support high-performance batch analysis.
2. Apache Kafka and Apache Flink are used as a stream-batch integrated processing engine to support both real-time and batch data processing. At the stream processing layer, real-time data streams are captured, processed and analyzed as they arrive, supporting low-latency real-time decisions; at the batch layer, large-scale offline data analysis is performed, supporting complex data mining and report generation. A better balance is thereby achieved between real-time data stream integration and batch data processing.
3. The platform combines several different algorithms, such as keyword matching, topic modeling, word embedding, sentiment analysis, rule engines and supervised learning, and integrates the analysis results of all the models, which greatly improves the accuracy of negative information identification.
4. Tailored to the characteristics of enterprise data, the platform presets a variety of data analysis models for different market space mining targets. Through the model setting panel, an operator can call a suitable model with only simple parameter settings and carry out complex data analysis. This greatly lowers the professional knowledge threshold for users and improves the performance and efficiency of data analysis.
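The idea in advantage 3 of integrating the outputs of several detectors can be illustrated with a weighted combination of per-detector scores. The sketch below is an assumption about how such integration might look, not the patented method: only two simple detectors (keyword matching and a one-rule rule engine) are implemented, and the keyword list, rule, weights and threshold are hypothetical. Topic modeling, word embedding, sentiment analysis and supervised models would plug in as further scoring functions.

```python
from typing import Callable, Dict

NEGATIVE_KEYWORDS = ("lawsuit", "default", "penalty", "bankruptcy")  # hypothetical


def keyword_score(text: str) -> float:
    """Keyword matching: score grows with the number of distinct keyword hits."""
    lowered = text.lower()
    hits = sum(1 for keyword in NEGATIVE_KEYWORDS if keyword in lowered)
    return min(1.0, hits / 2)


def rule_score(text: str) -> float:
    """Rule engine stand-in: one hand-written rule about court enforcement."""
    lowered = text.lower()
    return 1.0 if ("court" in lowered and "enforce" in lowered) else 0.0


DETECTORS: Dict[str, Callable[[str], float]] = {
    "keyword": keyword_score,
    "rule": rule_score,
}
WEIGHTS = {"keyword": 0.6, "rule": 0.4}  # hypothetical weights


def ensemble_score(text: str) -> float:
    """Weighted average of all detector scores, in the range 0 to 1."""
    total_weight = sum(WEIGHTS[name] for name in DETECTORS)
    weighted = sum(WEIGHTS[name] * detect(text) for name, detect in DETECTORS.items())
    return weighted / total_weight


def is_negative(text: str, threshold: float = 0.5) -> bool:
    """Flag the text as negative information when the combined score is high enough."""
    return ensemble_score(text) >= threshold


for sample in (
    "Court moves to enforce penalty after loan default",
    "Regulator issues penalty notice",
    "Company opens new branch office",
):
    print(sample, is_negative(sample))
```

A single weak signal (one keyword hit) stays below the threshold, while agreement between detectors pushes the score above it, which is the intuition behind combining models to reduce false positives.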
Drawings
The accompanying drawings, which are included to provide a further understanding of the application, illustrate and explain the application and are not to be construed as limiting the application. In the drawings:
FIG. 1 is a schematic diagram of a system architecture according to an embodiment of the present invention;
FIG. 2 is a flow chart of a method according to an embodiment of the invention.
Detailed Description
It should be noted that, in the case of no conflict, the embodiments and features in the embodiments may be combined with each other. The present application will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
It should be noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system such as a set of computer executable instructions, and that although a logical order is illustrated in the flowcharts, in some cases the steps illustrated or described may be performed in an order other than that illustrated herein.
Example 1
As shown in fig. 1, the enterprise information reconnaissance platform provided by the present invention includes:
the data acquisition module is used for expanding a data source based on initial enterprise client data, introducing business cooperation data and simultaneously collecting website data of credit investigation, vertical portals and enterprise news;
the data processing module is connected with the data acquisition module and is used for carrying out data planning processing on the expanded data to obtain target enterprise data;
the model construction module is connected with the data processing module and used for fusing enterprise business opportunity information based on target enterprise data to construct a market space mining model;
the model prediction module is connected with the model construction module and is used for carrying out enterprise business condition analysis and business opportunity mining according to the market space mining model to obtain a prediction result;
the risk prediction module is connected with the model prediction module and is used for carrying out risk judgment and early warning according to the prediction result, obtaining the abnormal condition of enterprise operation and outputting a visual risk analysis result and an analysis report.
Further, the data processing module comprises a data aggregation unit, a data cleaning and integrating unit, a data storage unit and an interface design unit;
the data aggregation unit is used for importing data from relational databases into Hadoop or Hive by means of Sqoop commands using the Sqoop component, and for handling the serial and parallel data import tasks using the Oozie component;
the data cleaning and integrating unit is used for, when data is imported, applying the Hadoop, Hive and Spark components to perform consistency checks, invalid-value and missing-value handling, duplicate-data handling and data standardization on dirty data containing useless information;
the data storage unit is used for storing the processed structured data into the corresponding databases by having Spark call the interfaces of relational databases such as Oracle and MySQL and of the in-memory database Redis;
the interface design unit is used for defining the endpoints and data interaction modes of the API according to RESTful design principles, and for defining the data format of the API in JSON or XML; for API version control, the version number is included in the URL.
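The interface conventions described for the interface design unit (RESTful endpoints, JSON payloads, version number in the URL) might look like the following minimal sketch. The path prefix `/api`, the version label `v1`, the resource name `enterprises` and the response envelope are all hypothetical choices for illustration.

```python
import json
from typing import Any, Dict

API_VERSION = "v1"  # hypothetical version label carried in the URL


def endpoint(resource: str, resource_id: str = "") -> str:
    """Build a versioned RESTful path for a collection or a single resource."""
    path = f"/api/{API_VERSION}/{resource}"
    return f"{path}/{resource_id}" if resource_id else path


def to_json_response(data: Dict[str, Any]) -> str:
    """Wrap the payload in a JSON envelope that also reports the API version."""
    return json.dumps({"version": API_VERSION, "data": data},
                      ensure_ascii=False, sort_keys=True)


print(endpoint("enterprises"))
print(endpoint("enterprises", "E1"))
print(to_json_response({"enterprise_id": "E1", "name": "Acme Ltd"}))
```

Keeping the version in the path lets a later `v2` change the data format without breaking clients that still call `v1`.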
Further, the data storage unit comprises a data lake and a data warehouse;
the data lake is used to store raw, unprocessed data, including structured, semi-structured, and unstructured data;
the data warehouse is used to store cleaned, converted, and consolidated data to support high performance batch analysis.
Further, the market space mining model comprises a blue ocean market space mining model and a stock (existing-customer) market space mining model;
the blue ocean market space mining model comprises a business bid analysis model, a gateway enterprise model and a newly-built enterprise mining model;
the stock market space mining model comprises a demand mining model and a service upgrading model.
The business bid analysis model comprises a feature extraction unit, a business type screening unit, a bid state screening unit, an other-feature screening unit, an interference factor eliminating unit and a parameter operation unit;
wherein the feature extraction unit is used for extracting features related to the specified business, such as project names, project descriptions, release dates, expiration dates, bidding institutions and places. During extraction, natural language processing (NLP) techniques such as word segmentation, word frequency statistics and sentiment analysis are applied to the text data to extract more information.
The business type screening unit is used for setting the business type or project name keywords to be extracted in the analysis parameter setting panel.
The bid state screening unit is used for setting the bid issuing time and bid deadline in the analysis parameter setting panel.
The other-feature screening unit is used for setting features such as region, amount, industry and enterprise scale in the analysis parameter setting panel.
The interference factor eliminating unit is used for setting, in the analysis parameter setting panel, the items to be excluded, including regions, amounts, industries, enterprise scales and project keywords.
The parameter operation unit is used for starting the automated algorithm preset by the platform, which runs the text classification model and the time series model under the specified parameter settings to obtain a result.
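The way the screening units turn panel settings into a filtered bid list can be sketched as follows. This is a simplified illustration under stated assumptions: the text names a text classification model and a time series model, whereas the sketch substitutes plain keyword and range filters, and the `Bid` and `ScreeningParams` structures and all sample data are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Sequence


@dataclass
class Bid:
    project_name: str
    release_date: date
    deadline: date
    region: str
    amount: float


@dataclass
class ScreeningParams:
    keywords: Sequence[str]                # business type screening
    released_after: date                   # bid state screening
    deadline_before: date
    regions: Sequence[str] = ()            # other-feature screening
    min_amount: float = 0.0
    exclude_keywords: Sequence[str] = ()   # interference factor elimination


def screen_bids(bids: Iterable[Bid], params: ScreeningParams) -> List[Bid]:
    """Keep only the bids that satisfy every configured screening parameter."""
    selected = []
    for bid in bids:
        name = bid.project_name.lower()
        if not any(keyword.lower() in name for keyword in params.keywords):
            continue
        if bid.release_date < params.released_after or bid.deadline > params.deadline_before:
            continue
        if params.regions and bid.region not in params.regions:
            continue
        if bid.amount < params.min_amount:
            continue
        if any(keyword.lower() in name for keyword in params.exclude_keywords):
            continue
        selected.append(bid)
    return selected


bids = [
    Bid("Campus network upgrade", date(2023, 9, 1), date(2023, 10, 15), "Guangzhou", 1_200_000),
    Bid("Cloud network trial (cancelled)", date(2023, 9, 5), date(2023, 10, 20), "Guangzhou", 800_000),
    Bid("Office furniture purchase", date(2023, 9, 10), date(2023, 10, 30), "Shenzhen", 300_000),
    Bid("Data center network expansion", date(2023, 6, 1), date(2023, 7, 1), "Guangzhou", 2_000_000),
]
params = ScreeningParams(
    keywords=["network"],
    released_after=date(2023, 8, 1),
    deadline_before=date(2023, 12, 31),
    regions=["Guangzhou"],
    min_amount=500_000,
    exclude_keywords=["cancelled"],
)
matches = screen_bids(bids, params)
print([bid.project_name for bid in matches])
```

Only the campus network bid passes: the others fail on the exclusion keyword, the business keyword or the release date respectively.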
The gateway enterprise model comprises an association relation definition unit, a first association element configuration unit, a first model control unit and a first data result storage unit;
the association relation definition unit is used for defining the association relations among enterprises according to the requirements of the analysis application, including shareholder relations, business cooperation relations, fund account relations, external investment relations and regional relations.
The first association element configuration unit is used for selecting the corresponding association relation analysis model through the parameter setting panel of the analysis model and configuring association elements such as shareholder name, business name and region.
The first model control unit is used for controlling whether the analysis model runs; when the analysis model runs, the SQL tool preset by the platform associates each enterprise with the various relation analysis elements and finds the enterprise pairs and enterprise groups that share the same association factors.
The first data result storage unit is used for automatically storing the analyzed data in a data table, so that it can be retrieved and imported at any time and used for visual chart display.
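Finding enterprise pairs that share an association factor is naturally expressed as an SQL self-join. The sketch below uses Python's built-in `sqlite3` module purely for illustration; the platform's preset SQL tool, the table name `enterprise_relation`, its columns and the sample rows are all assumptions, not details from the patent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE enterprise_relation (
    enterprise    TEXT NOT NULL,  -- enterprise name
    relation_type TEXT NOT NULL,  -- e.g. shareholder, region
    element       TEXT NOT NULL   -- the association element value
);
INSERT INTO enterprise_relation VALUES
    ('Acme Ltd',  'shareholder', 'Zhang Wei'),
    ('Beta Co',   'shareholder', 'Zhang Wei'),
    ('Gamma Inc', 'shareholder', 'Li Na'),
    ('Acme Ltd',  'region',      'Guangzhou'),
    ('Gamma Inc', 'region',      'Guangzhou');
""")

# Self-join on relation type and element; the "<" comparison keeps each
# unordered pair once and prevents an enterprise from pairing with itself.
pairs = conn.execute("""
    SELECT a.enterprise, b.enterprise, a.relation_type, a.element
    FROM enterprise_relation AS a
    JOIN enterprise_relation AS b
      ON a.relation_type = b.relation_type
     AND a.element = b.element
     AND a.enterprise < b.enterprise
    ORDER BY a.enterprise, b.enterprise
""").fetchall()

for pair in pairs:
    print(pair)
```

Enterprise groups can be obtained from the same table by grouping on `relation_type` and `element` and keeping groups with more than one member.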
The new enterprise mining model comprises a first date calculation unit, a first data screening unit and a second data result storage unit;
the first date calculating unit is used for calculating the difference between the registration date and the current date of the enterprise in the data lake to determine whether the enterprise is registered in a specified time period. The time period may be set in a parameter setting panel.
The first data screening unit is used for screening enterprises with registration time within a specified time period based on the calculated date difference.
The second data result storage unit is used for automatically storing the screened data in a data table, so that it can be retrieved and imported at any time and used for visual chart display.
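The date calculation and screening steps of the newly-built enterprise mining model reduce to a date difference compared against a configurable window. A minimal sketch, assuming the registration dates are already available as a mapping from enterprise name to date (the names, dates and 30-day window are hypothetical):

```python
from datetime import date
from typing import Dict, List


def newly_registered(registrations: Dict[str, date], today: date,
                     window_days: int) -> List[str]:
    """Return enterprises registered within the last `window_days` days (inclusive)."""
    return sorted(
        name
        for name, registered_on in registrations.items()
        if 0 <= (today - registered_on).days <= window_days
    )


registrations = {
    "Acme Ltd": date(2023, 9, 15),
    "Beta Co": date(2023, 1, 10),
    "Gamma Inc": date(2023, 9, 1),
}
recent = newly_registered(registrations, today=date(2023, 10, 1), window_days=30)
print(recent)
```

The lower bound of zero excludes records whose registration date lies in the future, which would usually indicate dirty data.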
The demand mining model comprises an element change monitoring unit, a second associated element configuration unit, a second model control unit and a third data result storage unit;
the element change monitoring unit is used for applying real-time data stream processing technologies (Apache Kafka and Apache Flink) to monitor and analyze real-time changes of the data streams in the data lake. Elements related to demand analysis are monitored with priority, including enterprise address changes, branch openings, credit investigation changes, operation range changes and recruitment demand changes.
As a further refinement, this embodiment uses Apache Kafka and Apache Flink as a stream-batch integrated processing engine to support both real-time and batch data processing. At the stream processing layer, real-time data streams are captured, processed and analyzed as they arrive, supporting low-latency real-time decisions. At the batch layer, large-scale offline data analysis is performed, supporting complex data mining and report generation.
The second associated element configuration unit is used for selecting the corresponding demand mining analysis model through the parameter setting panel of the analysis model and configuring its associated elements, such as marketing business, branch establishment time, branch area, operation range keywords, recruitment post names and number of recruits.
The second model control unit is used for controlling whether the analysis model runs; when the analysis model runs, the SQL tool preset by the platform associates the marketing business demand elements with the enterprise association elements and finds the enterprises that match the marketing business demand elements. When no marketing business is selected, the analysis model identifies the demands associated with new changes, which may involve personnel, technology, resources, processes, markets, customers and so on.
The third data result storage unit is used for automatically storing the analyzed data in a data table, so that it can be retrieved and imported at any time and used for visual chart display.
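The flow from element change monitoring to demand matching can be sketched with a plain Python generator standing in for the stream. This is not Kafka or Flink code: it only illustrates the logic of comparing each incoming enterprise snapshot with the previous one and mapping changed elements to candidate services. The monitored field names and the `DEMAND_RULES` mapping are hypothetical.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Tuple

Snapshot = Dict[str, str]
Change = Tuple[str, str, Optional[str], Optional[str]]  # (enterprise, field, old, new)

MONITORED_FIELDS = ("address", "business_scope", "recruitment_post")
# Hypothetical mapping from a changed element to a marketing business.
DEMAND_RULES = {
    "address": "broadband relocation",
    "recruitment_post": "cloud desktop",
}


def detect_changes(stream: Iterable[Tuple[str, Snapshot]]) -> Iterator[Change]:
    """Yield one change per monitored field that differs from the last snapshot."""
    last_seen: Dict[str, Snapshot] = {}
    for enterprise_id, snapshot in stream:
        previous = last_seen.get(enterprise_id)
        if previous is not None:
            for field in MONITORED_FIELDS:
                old, new = previous.get(field), snapshot.get(field)
                if old != new:
                    yield enterprise_id, field, old, new
        last_seen[enterprise_id] = snapshot


def match_demands(changes: Iterable[Change]) -> List[Tuple[str, str]]:
    """Map each detected change to the marketing business it may indicate."""
    return [
        (enterprise_id, DEMAND_RULES[field])
        for enterprise_id, field, _old, _new in changes
        if field in DEMAND_RULES
    ]


events = [
    ("E1", {"address": "Road A", "business_scope": "retail"}),
    ("E2", {"address": "Road B"}),
    ("E1", {"address": "Road C", "business_scope": "retail"}),
]
changes = list(detect_changes(events))
demands = match_demands(changes)
print(changes)
print(demands)
```

In a streaming deployment the `last_seen` dictionary would correspond to per-key state held by the stream processor, with the enterprise identifier as the key.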
The service upgrading model comprises a second date calculating unit, a second data screening unit and a fourth data result storage unit;
the second date calculating unit is used for calculating the expiration time of the business transacted by the enterprises in the data lake.
The second data screening unit is used for screening, based on the calculated expiration time, the enterprises whose relevant service expires within a threshold, the service expiration parameter being set through the setting panel.
The fourth data result storage unit is used for automatically storing the screened data in a data table, so that it can be retrieved and imported at any time and used for visual chart display.
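The service upgrading model's two steps, computing each service's expiration time and screening those that expire within a threshold, can be sketched as below. The contract tuple layout (enterprise, service, start date, term in days), the sample contracts and the 30-day threshold are assumptions made for illustration.

```python
from datetime import date, timedelta
from typing import Iterable, List, Tuple

Contract = Tuple[str, str, date, int]  # (enterprise, service, start date, term in days)


def expiring_services(contracts: Iterable[Contract], today: date,
                      threshold_days: int) -> List[Tuple[str, str, date]]:
    """Return services that expire within `threshold_days` days from today."""
    result = []
    for enterprise, service, start, term_days in contracts:
        expires_on = start + timedelta(days=term_days)  # date calculation step
        remaining = (expires_on - today).days
        if 0 <= remaining <= threshold_days:            # screening step
            result.append((enterprise, service, expires_on))
    return result


contracts = [
    ("Acme Ltd", "leased line", date(2022, 10, 15), 365),
    ("Beta Co", "cloud host", date(2023, 1, 1), 730),
    ("Gamma Inc", "broadband", date(2022, 9, 1), 365),
]
due = expiring_services(contracts, today=date(2023, 10, 1), threshold_days=30)
print(due)
```

Services that have already expired are excluded here; a real deployment might instead report them separately as renewal or win-back candidates.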
Further, the model prediction module comprises an information base construction unit, a business opportunity mining unit and a prediction unit.
The information base construction unit is used for creating database tables from the bidding information of the imported enterprise data, then designing a data model and determining the relations between the tables, thereby obtaining a market bidding information base. The bidding information of the imported enterprise data includes the project name, bidding party, expiration date, contract amount, and the like.
The business opportunity mining unit is used for building a market space big data mining model library based on the market bidding information base, combined with the information on existing enterprise customers of the local network, and for mining the growth potential of government-and-enterprise business in the blue ocean market, the stock (existing-customer) market and the other-network market.
The prediction unit is used for, after market space mining, visually displaying the screened enterprise data and service matching report and exporting them to a file according to the user's needs.
Further, the risk prediction module comprises an information change monitoring unit, a negative information detection unit, a negative information collection unit, a risk early warning unit and a visual output unit.
The information change monitoring unit is used for applying real-time data stream processing technology to monitor and analyze real-time changes of the data streams in the data lake and to obtain change information for each dimension of an enterprise.
This embodiment uses Apache Kafka and Apache Flink as a unified stream-and-batch processing engine that supports both real-time and batch data processing. At the stream processing layer, real-time data streams are captured, processed, and analyzed as they arrive, supporting low-latency decisions. At the batch layer, large-scale offline data analysis supports complex data mining and report generation.
The negative information detection unit is used for combining different algorithms, namely keyword matching, topic modeling, word embedding, sentiment analysis, a rule engine and supervised learning, and for integrating the analysis results of the market space mining model.
The negative information collection unit is used for automatically selecting a trained model and applying it to the text data to be classified, so as to sort the text data into different negative information categories.
The risk early warning unit is used for notifying the relevant personnel by email and short message when an enterprise is at risk.
The visual output unit is used for visually displaying the risk analysis result and exporting a risk analysis report.
Example two
As shown in fig. 2, the enterprise information reconnaissance method provided by the invention comprises the following steps:
expanding the data sources based on initial enterprise client data, introducing business cooperation data, collecting website data from credit investigation sites, vertical portals and enterprise news, and obtaining target enterprise data through data planning processing;
integrating enterprise business opportunity information based on the target enterprise data to construct a market space mining model, and carrying out enterprise business condition analysis and business opportunity mining according to the market space mining model to obtain a prediction result;
and carrying out risk judgment and early warning according to the prediction result, identifying abnormal conditions in enterprise operation, and outputting visualized abnormality results and an analysis report.
Further, the process of obtaining the target enterprise data through the data planning processing is as follows.
Based on enterprise client data, the Internet data sources are comprehensively expanded: business cooperation data is introduced while website data from credit investigation sites, vertical portals, enterprise news and the like is collected, and a full-coverage, unique and dynamic enterprise big data mart is formed through data planning processing. The steps are as follows:
Step 1: Data aggregation: the Sqoop component is used to import data from the relational databases into Hadoop or Hive through Sqoop commands. The Oozie component is used to handle the serial and parallel data import tasks and to resolve the dependencies between them.
Step 2: Data cleaning and integration: when data is imported, the Hadoop, Hive and Spark components are used to perform consistency checking, invalid value and missing value processing, duplicate data processing and data standardization on dirty data containing useless information.
Step 3: Data storage: Spark calls the interfaces of relational databases such as Oracle and MySQL and of the in-memory database Redis to store the processed structured data in the corresponding databases.
As a further refinement, the data storage of this embodiment uses a private data center to establish a data lake (Data Lake) and a data warehouse (Data Warehouse). The data lake stores raw, unprocessed data, including structured, semi-structured and unstructured data. The data warehouse stores cleaned, converted and consolidated data to support high-performance batch analysis.
Step 4: Data service interface design: the RESTful design principles are used to define the endpoints (Endpoints) of the API and the data interaction mode. The data format of the API is defined using JSON or XML, and, for API version control, the version number is included in the URL.
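The cleaning operations named in Step 2 can be illustrated with a minimal, single-machine sketch. It is not the Hadoop/Hive/Spark implementation of the embodiment; the field names (`enterprise_id`, `name`, `region`, `registered_capital`) and the function `clean_records` are hypothetical and chosen only for illustration.

```python
def clean_records(records):
    """Apply the Step 2 operations to a list of dict records (hypothetical schema)."""
    seen_ids = set()
    cleaned = []
    for raw in records:
        # Standardization: trim strings and treat empty strings as missing.
        rec = {}
        for key, value in raw.items():
            if isinstance(value, str):
                value = value.strip() or None
            rec[key] = value

        # Missing-value handling: records without an id or a name are dropped,
        # a missing region is filled with a placeholder.
        if not rec.get("enterprise_id") or not rec.get("name"):
            continue
        rec["region"] = str(rec.get("region") or "UNKNOWN").upper()

        # Invalid-value handling: capital must parse as a non-negative number.
        capital = rec.get("registered_capital")
        try:
            capital = float(capital) if capital is not None else None
        except (TypeError, ValueError):
            capital = None
        if capital is not None and capital < 0:
            capital = None
        rec["registered_capital"] = capital

        # Duplicate handling: keep the first record seen for each enterprise id.
        if rec["enterprise_id"] in seen_ids:
            continue
        seen_ids.add(rec["enterprise_id"])
        cleaned.append(rec)
    return cleaned
```

In a distributed setting the same rules would typically be expressed as transformations on a distributed dataset rather than a Python loop; the sketch only fixes the order of the checks.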
Further, the process of carrying out enterprise business condition analysis and business opportunity mining according to the market space mining model and obtaining the prediction result comprises:
creating database tables from the bidding information of the imported enterprise data, designing a data model, and determining the relations between the tables to obtain a market bidding information base;
building a market space big data mining model library based on the market bidding information base, combined with the information on existing enterprise customers of the local network, and mining the growth potential of government-and-enterprise business in the blue ocean market, the stock (existing-customer) market and the other-network market;
after market space mining has been carried out through the market space mining model, visually displaying the screened enterprise data and service matching report and exporting them to a file according to the user's needs.
As a further refinement, enterprise business opportunity information is integrated on the basis of enterprise client data to construct various types of market space mining models, so that enterprise business condition analysis is realized and business opportunities are mined. The method comprises the following 3 steps:
Step 1: Build the market bidding information base.
Sub-step 1: Import the bidding information of the enterprise data, including project names, bidding parties, expiration dates, contract amounts, and the like.
Sub-step 2: Create database tables to store the bidding information.
Sub-step 3: Design the data model and determine the relations between the tables.
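The three sub-steps can be sketched with an in-process SQLite database. The table names `bidder` and `bid_project`, their columns, and the helper functions are assumptions made for this illustration; the source names only the four fields (project name, bidding party, expiration date, contract amount) and does not specify a schema or a database product for this base.

```python
import sqlite3

# Hypothetical two-table model: one bidding party can have many bid projects.
SCHEMA = """
CREATE TABLE bidder (
    bidder_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE
);
CREATE TABLE bid_project (
    project_id      INTEGER PRIMARY KEY,
    project_name    TEXT NOT NULL,
    bidder_id       INTEGER NOT NULL REFERENCES bidder(bidder_id),
    expiration_date TEXT NOT NULL,  -- ISO 8601 date string
    contract_amount REAL
);
"""


def build_bid_base(conn):
    """Sub-steps 2 and 3: create the tables and the relation between them."""
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)


def add_bid(conn, project_name, bidder_name, expiration_date, contract_amount):
    """Sub-step 1: import one bidding record, reusing an existing bidder row."""
    conn.execute("INSERT OR IGNORE INTO bidder(name) VALUES (?)", (bidder_name,))
    bidder_id = conn.execute(
        "SELECT bidder_id FROM bidder WHERE name = ?", (bidder_name,)
    ).fetchone()[0]
    conn.execute(
        "INSERT INTO bid_project(project_name, bidder_id, expiration_date, contract_amount)"
        " VALUES (?, ?, ?, ?)",
        (project_name, bidder_id, expiration_date, contract_amount),
    )
```

Splitting the bidding party into its own table is one possible design choice; it lets later models join bids to other enterprise data by party.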
Step 2: Carry out business opportunity mining based on the market space mining models.
Based on the market bidding information base, a market space big data mining model library is built in combination with the information on existing enterprise customers of the local network, and the growth potential of government-and-enterprise business in the blue ocean market, the stock market and the other-network market is mined to empower government-and-enterprise business marketing.
1. Blue ocean market space mining model
(1) Business bid analysis model:
Step 1: Extract the features associated with a specified business, such as project name, project description, release date, expiration date, bidding organization, location, and the like. During extraction, natural language processing (NLP) techniques such as word segmentation, word frequency statistics and sentiment analysis are applied to the text data to extract more information.
Step 2: Service type screening: in the analysis parameter setting panel, set the service type or project name keywords to be extracted.
Step 3: Bidding status screening: in the analysis parameter setting panel, set the bid release time and the bid expiration time.
Step 4: Other feature screening: features such as region, amount, industry and enterprise scale can optionally be set in the analysis parameter setting panel.
Step 5: Interference factor rejection: the items to be rejected can optionally be set in the analysis parameter setting panel, including region, amount, industry, enterprise scale and project keywords.
Step 6: According to the parameters that have been set, start the automated algorithm preset by the platform, which uses the text classification model and the time series model to compute the result.
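The parameter-driven screening of Steps 2 to 5 can be sketched as a plain filter. This is an illustration only: the `Bid` and `ScreeningParams` types and their fields are hypothetical, keyword matching stands in for the NLP feature extraction of Step 1, and the text classification and time series models of Step 6 are not modeled.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Set


@dataclass
class Bid:
    project_name: str
    release_date: date
    expiration_date: date
    region: str
    amount: float


@dataclass
class ScreeningParams:
    keywords: List[str]                   # Step 2: service type / project keywords
    released_on_or_after: date            # Step 3: bid release time
    expires_on_or_before: date            # Step 3: bid expiration time
    regions: Optional[Set[str]] = None    # Step 4: optional feature filter
    exclude_keywords: List[str] = field(default_factory=list)  # Step 5


def screen_bids(bids, params):
    """Return the bids that pass every configured screening condition."""
    selected = []
    for bid in bids:
        name = bid.project_name.lower()
        if not any(k.lower() in name for k in params.keywords):
            continue
        if bid.release_date < params.released_on_or_after:
            continue
        if bid.expiration_date > params.expires_on_or_before:
            continue
        if params.regions is not None and bid.region not in params.regions:
            continue
        if any(k.lower() in name for k in params.exclude_keywords):
            continue
        selected.append(bid)
    return selected
```

Leaving `regions` as `None` mirrors the optional nature of Step 4: an unset parameter imposes no restriction.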
(2) Associated enterprise model:
Step 1: Define the association relationships: according to the analysis application requirements, define the association relationships among enterprises, including stakeholder association relationships, business cooperation relationships, fund account relationships, external investment relationships and regional relationships.
Step 2: Configure the association elements: select the corresponding association relationship analysis model through the parameter setting panel of the analysis model, and configure association elements such as stakeholder name, business name, region, and the like.
Step 3: Run the analysis model: the SQL tool preset by the platform associates the enterprises with the various relationship analysis elements and finds the enterprise pairs and enterprise groups that share the same association factors.
Step 4: Store the data result: the analyzed data is automatically stored in a data table so that it can be retrieved and imported at any time and used for visual chart display.
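Step 3 amounts to a self-join on the chosen association element. The sketch below shows this for the stakeholder relationship using SQLite; the table `stakeholder_link`, its columns, and both functions are hypothetical, and the platform's preset SQL tool is not described in enough detail to reproduce.

```python
import sqlite3


def create_link_table(conn):
    """Hypothetical table: one row per (enterprise, stakeholder) association."""
    conn.execute(
        "CREATE TABLE stakeholder_link (enterprise TEXT NOT NULL, stakeholder TEXT NOT NULL)"
    )


def find_associated_pairs(conn):
    """Return (enterprise_a, enterprise_b, stakeholder) for enterprises sharing a stakeholder."""
    query = """
        SELECT a.enterprise, b.enterprise, a.stakeholder
        FROM stakeholder_link AS a
        JOIN stakeholder_link AS b
          ON a.stakeholder = b.stakeholder
         AND a.enterprise < b.enterprise   -- emit each unordered pair once
        ORDER BY a.stakeholder, a.enterprise, b.enterprise
    """
    return conn.execute(query).fetchall()
```

The same pattern applies to the other relationship types by swapping the joined column, for example a region or a fund account.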
(3) Newly established enterprise mining model:
Step 1: Date calculation: calculate the difference between the registration date of each enterprise in the data lake and the current date to determine whether the enterprise was registered within the specified time period. The time period can be set in the parameter setting panel.
Step 2: Data screening: based on the calculated date difference, screen out the enterprises whose registration time falls within the specified time period.
Step 3: Store the data result: the screened data is automatically stored in a data table so that it can be retrieved and imported at any time and used for visual chart display.
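The date calculation and screening of Steps 1 and 2 can be sketched as follows. The function name `newly_registered`, the `(name, registration_date)` tuple layout, and expressing the time period as a number of days are assumptions made for illustration.

```python
from datetime import date


def newly_registered(enterprises, today, window_days):
    """Return (name, age_in_days) for enterprises registered within the window.

    `enterprises` is an iterable of (name, registration_date) tuples.
    """
    result = []
    for name, registered_on in enterprises:
        age_days = (today - registered_on).days  # Step 1: date difference
        if 0 <= age_days <= window_days:         # Step 2: screening
            result.append((name, age_days))
    return result
```

Passing `today` explicitly rather than calling `date.today()` inside the function keeps the screening reproducible when a stored result is recomputed later.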
2. Stock market space mining model
(1) Demand mining model (address change, branch opening, business demand change, recruitment increase)
Step 1: Monitor changes in the demand analysis elements: real-time data stream processing technology (Apache Kafka and Apache Flink) is used to monitor and analyze real-time changes of the data streams in the data lake. Elements related to demand analysis are monitored closely, including enterprise address changes, branch openings, credit investigation changes, business scope changes, recruitment demand changes, and the like.
Step 2: Configure association elements for the analysis model: select the corresponding demand mining analysis model through the parameter setting panel of the analysis model, and configure related elements for it, such as the marketing business, branch establishment time, branch region, business scope keywords, recruitment post names, and number of recruits.
Step 3: Run the analysis model: the SQL tool preset by the platform associates the marketing business demand elements with the enterprise association elements and finds the enterprises that match those demand elements. When no marketing business is selected, the analysis model identifies demands associated with new changes, which may relate to personnel, technology, resources, processes, markets, customers, and the like.
Step 4: Store the data result: the analyzed data is automatically stored in a data table so that it can be retrieved and imported at any time and used for visual chart display.
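The association in Step 3 can be sketched as a join between detected change events and the configured demand elements. The tables `change_event` and `demand_element`, their columns, and the functions below are hypothetical; in particular, treating "no marketing business selected" as "return matches for every configured service" is one possible reading of the text, not a confirmed behavior.

```python
import sqlite3


def create_tables(conn):
    """Hypothetical tables for monitored changes and configured demand elements."""
    conn.executescript(
        """
        CREATE TABLE change_event (enterprise TEXT, change_type TEXT, detail TEXT);
        CREATE TABLE demand_element (service TEXT, change_type TEXT, keyword TEXT);
        """
    )


def match_demands(conn, service=None):
    """Return (enterprise, service, change_type) rows where an event matches an element.

    With service=None no marketing business is selected and all services are considered.
    """
    query = """
        SELECT e.enterprise, d.service, e.change_type
        FROM change_event AS e
        JOIN demand_element AS d
          ON e.change_type = d.change_type
         AND e.detail LIKE '%' || d.keyword || '%'
        WHERE (:service IS NULL OR d.service = :service)
        ORDER BY e.enterprise, d.service
    """
    return conn.execute(query, {"service": service}).fetchall()
```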
(2) Service upgrade model:
Step 1: Date calculation: calculate the expiration time of the services that enterprises in the data lake have subscribed to.
Step 2: Data screening: based on the calculated expiration time and the service expiration parameter set in the setting panel, screen out the enterprises whose relevant service expiration time falls within the threshold.
Step 3: Store the data result: the screened data is automatically stored in a data table so that it can be retrieved and imported at any time and used for visual chart display.
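A minimal sketch of Steps 1 and 2 follows. The source does not say how the expiration time is derived; the sketch assumes, purely for illustration, that each record carries a start date and a term in months, and that the threshold is a number of days. The function names are hypothetical.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Return the date `months` later, clamping the day to the end of a shorter month."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def expiring_soon(subscriptions, today, threshold_days):
    """Return (enterprise, service, expires_on) for services expiring within the threshold.

    `subscriptions` is an iterable of (enterprise, service, start_date, term_months).
    """
    hits = []
    for enterprise, service, start, term_months in subscriptions:
        expires_on = add_months(start, term_months)   # Step 1: date calculation
        days_left = (expires_on - today).days
        if 0 <= days_left <= threshold_days:          # Step 2: screening
            hits.append((enterprise, service, expires_on))
    return hits
```

Already-expired services (negative `days_left`) are excluded here; whether they should also be surfaced as upgrade candidates is a policy choice the source leaves open.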
Step 3: Output the target enterprise list and the service matching report.
After market space mining has been carried out through the above mining models, the user can choose, as needed, to display the screened enterprise data and service matching report visually and to export them to a file.
Further, the process of carrying out risk judgment and early warning according to the prediction result is as follows.
By comprehensively analyzing information such as the enterprise's basic information, business operation data, violation records, social security participation data, tax arrears data and public opinion data, the platform carries out risk judgment and early warning on the enterprise's operating abnormalities, judicial risk, telecom risk and reputation risk, so that the user learns of operating abnormalities of a target enterprise at the earliest opportunity.
Step 1: Monitor changes in enterprise information: real-time data stream processing technology (Apache Kafka and Apache Flink) is used to monitor and analyze real-time changes of the data streams in the data lake and to obtain change information for each dimension of the enterprise.
As a further refinement, this embodiment uses Apache Kafka and Apache Flink as a unified stream-and-batch processing engine that supports both real-time and batch data processing. At the stream processing layer, real-time data streams are captured, processed, and analyzed as they arrive, supporting low-latency decisions. At the batch layer, large-scale offline data analysis supports complex data mining and report generation.
Step 2: Detect negative information: the platform combines several different algorithms, such as keyword matching, topic modeling, word embedding, sentiment analysis, rule engines and supervised learning, and integrates the analysis results of all the models to improve the accuracy with which negative information is identified.
Step 3: Collect negative information: the platform automatically selects a trained model and applies it to the text data to be classified, sorting the text into different negative information categories. The model assigns one or more category labels to each text sample, indicating the categories to which it belongs.
Step 4: Risk early warning: when an enterprise is at risk, the platform promptly notifies the relevant personnel by email and short message.
Step 5: Visualize or output the data: as needed, the risk analysis result can be displayed visually and a risk analysis report can be exported.
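Steps 2 to 4 can be sketched as a small ensemble in which each detector proposes category labels, the votes are merged, and a notification is issued when any label survives. Only two of the listed techniques (keyword matching and a rule engine) are stood in for; the keyword lists, thresholds, record fields, category names and the `notify` callback are all hypothetical, and the trained classification model of Step 3 is not modeled.

```python
from collections import Counter

# Hypothetical keyword lists per negative information category.
NEGATIVE_KEYWORDS = {
    "judicial": ("lawsuit", "court enforcement"),
    "operation": ("abnormal operation", "license revoked"),
    "tax": ("tax arrears",),
}


def keyword_detector(text):
    """Keyword matching: return the categories whose keywords occur in the text."""
    lowered = text.lower()
    return {
        category
        for category, words in NEGATIVE_KEYWORDS.items()
        if any(word in lowered for word in words)
    }


def rule_detector(record):
    """Rule engine stand-in: fixed thresholds on structured fields."""
    categories = set()
    if record.get("unpaid_tax", 0) > 0:
        categories.add("tax")
    if record.get("open_cases", 0) >= 3:
        categories.add("judicial")
    return categories


def assess(record, notify, min_votes=1):
    """Merge detector outputs into labels and notify when any label is assigned."""
    votes = Counter()
    for labels in (keyword_detector(record.get("news", "")), rule_detector(record)):
        votes.update(labels)
    assigned = sorted(c for c, n in votes.items() if n >= min_votes)
    if assigned:
        notify(record["enterprise"], assigned)  # stands in for email / short message
    return assigned
```

Raising `min_votes` trades recall for precision: with two detectors, `min_votes=2` reports a category only when both agree.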
The invention builds the enterprise big data mart around two application directions, business condition analysis and risk management, so that the reconnaissance platform has application capabilities such as business condition analysis, market space mining, enterprise business opportunity mining and enterprise risk monitoring.
The foregoing is merely a preferred embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions easily conceivable by those skilled in the art within the technical scope of the present application should be covered in the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (10)
1. An enterprise information reconnaissance platform, comprising:
the data acquisition module is used for expanding a data source based on initial enterprise client data, introducing business cooperation data and simultaneously collecting website data of credit investigation, vertical portals and enterprise news;
the data processing module is connected with the data acquisition module and is used for carrying out data planning processing on the expansion data to obtain target enterprise data;
the model construction module is connected with the data processing module and used for fusing enterprise business opportunity information based on the target enterprise data to construct a market space mining model;
the model prediction module is connected with the model construction module and is used for carrying out enterprise business condition analysis and business opportunity mining according to the market space mining model to obtain a prediction result;
the risk prediction module is connected with the model prediction module and used for carrying out risk judgment and early warning according to the prediction result, obtaining abnormal conditions of enterprise operation and outputting a visual risk analysis result and an analysis report.
2. The enterprise information reconnaissance platform of claim 1,
the data processing module comprises a data aggregation unit, a data cleaning and integrating unit, a data storage unit and an interface design unit;
the data aggregation unit is used for importing data from the relational database into Hadoop or Hive through Sqoop commands by means of the Sqoop component, and is further used for processing the serial and parallel tasks of data import by means of the Oozie component;
the data cleaning and integrating unit is used for, when data is imported, performing consistency checking, invalid value and missing value processing, duplicate data processing and data standardization on dirty data containing useless information by means of the Hadoop, Hive and Spark components;
the data storage unit is used for storing the processed structured data into the corresponding databases by having Spark call the interfaces of the Oracle and MySQL relational databases and of the in-memory database Redis;
the interface design unit is used for defining the endpoint and the data interaction mode of the API by using RESTful design principle and defining the data format of the API by using JSON or XML.
3. The enterprise information reconnaissance platform of claim 2,
the data storage unit comprises a data lake and a data warehouse;
the data lake is used for storing original and unprocessed data, including structured, semi-structured and unstructured data;
the data warehouse is used for storing data subjected to cleaning, conversion and arrangement so as to support high-performance batch analysis.
4. The enterprise information reconnaissance platform of claim 1,
the market space mining model comprises a blue ocean market space mining model and a stock market space mining model;
the blue ocean market space mining model comprises a business bid analysis model, an associated enterprise model and a newly established enterprise mining model;
the stock market space mining model comprises a demand mining model and a service upgrading model.
5. The enterprise information reconnaissance platform of claim 1,
the model prediction module comprises an information base construction unit, a business opportunity mining unit and a prediction unit;
the information base construction unit is used for creating database tables from the bidding information of the imported enterprise data, then designing a data model and determining the relations between the tables to obtain a market bidding information base;
the business opportunity mining unit is used for building a market space big data mining model library based on the market bidding information base, combined with the information on existing enterprise customers of the local network, and for mining the growth potential of government-and-enterprise business in the blue ocean market, the stock market and the other-network market;
the prediction unit is used for, after market space mining, visually displaying the screened enterprise data and service matching report and exporting them to a file according to the user's needs.
6. The enterprise information reconnaissance platform of claim 1,
the risk prediction module comprises an information change monitoring unit, a negative information detection unit, a negative information collection unit, a risk early warning unit and a visual output unit;
the information change monitoring unit is used for monitoring and analyzing the real-time change of the data flow in the data lake by applying a real-time data flow processing technology and obtaining change information of each dimension of an enterprise;
the negative information detection unit is used for combining different algorithms, namely keyword matching, topic modeling, word embedding, sentiment analysis, a rule engine and supervised learning, and for integrating the analysis results of the market space mining model;
the negative information collecting unit is used for automatically selecting a trained model, applying the trained model to text data to be classified, and classifying the text data into different negative information categories;
the risk early warning unit is used for notifying related personnel through an email and a short message when the enterprise has risks;
the visual output unit is used for visually displaying the risk analysis result and exporting a risk analysis report.
7. An enterprise information reconnaissance method, comprising the following steps:
expanding the data sources based on initial enterprise client data, introducing business cooperation data, collecting website data from credit investigation sites, vertical portals and enterprise news, and obtaining target enterprise data through data planning processing;
integrating enterprise business opportunity information based on the target enterprise data to construct a market space mining model, and carrying out enterprise business condition analysis and business opportunity mining according to the market space mining model to obtain a prediction result;
and carrying out risk judgment and early warning according to the prediction result, identifying abnormal conditions in enterprise operation, and outputting visualized abnormality results and an analysis report.
8. The method of claim 7, wherein,
the process of obtaining target enterprise data through the data planning process includes,
importing data from the relational database into Hadoop or Hive through Sqoop commands by means of the Sqoop component, and processing the serial and parallel tasks of data import by means of the Oozie component;
when data is imported, performing consistency checking, invalid value and missing value processing, duplicate data processing and data standardization on dirty data containing useless information by means of the Hadoop, Hive and Spark components;
storing the processed structured data into the corresponding databases by having Spark call the interfaces of relational databases such as Oracle and MySQL and of the in-memory database Redis;
designing a data service interface, wherein the RESTful design principles are used to define the endpoints and data interaction mode of the API, and JSON or XML is used to define the data format of the API.
9. The method of claim 7, wherein,
carrying out enterprise business condition analysis and business opportunity mining according to the market space mining model and obtaining the prediction result comprises,
creating database tables from the bidding information of the imported enterprise data, designing a data model, and determining the relations between the tables to obtain a market bidding information base;
building a market space big data mining model library based on the market bidding information base, combined with the information on existing enterprise customers of the local network, and mining the growth potential of government-and-enterprise business in the blue ocean market, the stock market and the other-network market;
after market space mining has been carried out through the market space mining model, visually displaying the screened enterprise data and service matching report and exporting them to a file according to the user's needs.
10. The method of claim 7, wherein,
the risk judging and early warning process according to the prediction result comprises,
applying real-time data stream processing technology to monitor and analyze real-time changes of the data streams in the data lake and to obtain change information for each dimension of an enterprise;
combining different algorithms, namely keyword matching, topic modeling, word embedding, sentiment analysis, a rule engine and supervised learning, and integrating the analysis results of the market space mining model;
automatically selecting a trained model, applying the trained model to the text data to be classified, and sorting the text data into different negative information categories;
when the enterprise has risks, notifying related personnel through emails and short messages;
and visually displaying the risk analysis result as required, and exporting a risk analysis report.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311322724.3A CN117391440A (en) | 2023-10-12 | 2023-10-12 | Enterprise information reconnaissance platform and method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117391440A true CN117391440A (en) | 2024-01-12 |
Family
ID=89469473
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311322724.3A Pending CN117391440A (en) | 2023-10-12 | 2023-10-12 | Enterprise information reconnaissance platform and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117391440A (en) |
- 2023-10-12: CN application CN202311322724.3A, patent/CN117391440A/en, active, Pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||