CN112837199A - Method for establishing big data service platform of small and medium-sized micro-enterprises - Google Patents
- Publication number
- CN112837199A
- Authority
- CN
- China
- Prior art keywords
- data
- training
- establishing
- enterprise
- medium
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/26—Government or public services
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
- G06F16/285—Clustering or classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/088—Non-supervised learning, e.g. competitive learning
Abstract
The invention relates to a method for establishing a big data service platform for small, medium-sized and micro enterprises, comprising the following steps. Step one: collect enterprise data and form a thematic database through data aggregation. Step two: establish a label system for the target enterprise according to the enterprise's data and the thematic database formed in step one. Step three: establish an enterprise industry chain analysis platform.
Description
Technical Field
The invention relates to the technical field of enterprise service platforms, and in particular to a method for establishing a big data service platform for small, medium-sized and micro enterprises.
Background
At present, government departments do not use tools such as big data to collect scattered public credit information about enterprises. Lacking data support, they cannot fully grasp the registration, operation, exit and credit risk changes of small and medium-sized enterprises in the key industry chains of a region, nor the scope of enterprises' financial activities and their financing conditions in the region. A cloud platform is therefore needed to collect the public credit information of enterprises in the region, together with the registration, credit granting, default and other circumstances of financing enterprises, so that changes in the distribution of industry chain related parties, in the regions of business activity and in the financial needs of all enterprises in the region can be tracked in real time. This gives the government and related departments a credit big data basis for decisions on industry support policies and credit support policies for small and medium-sized enterprises.
Disclosure of Invention
In view of the shortcomings of the prior art, the technical problem to be solved by this patent application is how to provide a method for establishing a big data service platform for small, medium-sized and micro enterprises that yields an intuitive, convenient and efficient enterprise service platform.
To solve the above technical problem, the invention adopts the following technical solution:
A method for establishing a big data service platform for small, medium-sized and micro enterprises comprises the following steps:
step one: collecting enterprise data and forming a thematic database through data aggregation;
step two: establishing a label system for the target enterprise according to the enterprise's data and the thematic database formed in step one;
step three: establishing an enterprise industry chain analysis platform.
Further, in step one, enterprise data are collected from government public credit information and market credit information, and multiple index types are supported, including global indexes, local indexes, high-dimensional indexes and full-text indexes.
Further, in step one, data aggregation extracts data from each information source and aggregates them into a target model using the Spark computing engine. The engine provides interactive data processing by combining distributed memory with columnar storage. It introduces the concept of the RDD (resilient distributed dataset), and every statistical analysis task is composed of a number of basic operations on RDDs. RDDs are kept in memory so that subsequent tasks read the data directly from memory; the engine compiles an analysis task into a directed acyclic graph of RDDs and merges adjacent tasks according to the dependencies between the data.
Further, in step one, the Spark computing engine also supports distributed hybrid columnar storage across memory and flash media, and supports interactive analysis of massive data by loading the data into the distributed in-memory columnar storage of the analytical database.
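The lazy, in-memory execution model described above can be illustrated with a small sketch. This is not Spark code and does not use the Spark API: `LazyDataset`, `aggregate_by_enterprise` and the field names (`enterprise_id`, `registered`, `overdue_loans`) are hypothetical, chosen only to show how a chain of basic operations is recorded, fused into one pass, cached in memory, and then merged into a per-enterprise target model.

```python
from collections import defaultdict
from typing import Callable, Iterable, List, Optional


class LazyDataset:
    """Toy stand-in for an RDD: records a chain of operations and runs it on demand."""

    def __init__(self, source: Iterable[dict], ops: Optional[List[Callable]] = None):
        self._source = source
        self._ops = ops or []   # pending operations, in order (the lineage)
        self._cache = None      # in-memory result, filled on the first collect()

    def map(self, fn):
        # Nothing is computed here; the operation is only appended to the chain.
        return LazyDataset(self._source, self._ops + [lambda rows: (fn(r) for r in rows)])

    def filter(self, pred):
        return LazyDataset(self._source, self._ops + [lambda rows: (r for r in rows if pred(r))])

    def collect(self):
        if self._cache is None:
            rows = iter(self._source)
            for op in self._ops:
                rows = op(rows)          # generators chain, so adjacent ops run in one pass
            self._cache = list(rows)     # later calls read the cached result from memory
        return self._cache


def aggregate_by_enterprise(datasets):
    """Merge records from several sources into one record per enterprise."""
    merged = defaultdict(dict)
    for ds in datasets:
        for row in ds.collect():
            merged[row["enterprise_id"]].update(row)
    return dict(merged)


# Two hypothetical information sources.
public_credit = LazyDataset([
    {"enterprise_id": "E1", "registered": True},
    {"enterprise_id": "E2", "registered": False},
]).filter(lambda r: r["registered"])

market_credit = LazyDataset([
    {"enterprise_id": "E1", "overdue_loans": 0},
]).map(lambda r: {**r, "has_default": r["overdue_loans"] > 0})

topic_db = aggregate_by_enterprise([public_credit, market_credit])
```

In a real deployment the same shape would be expressed with the engine's own distributed operations; the sketch only shows why deferring execution lets adjacent steps be combined.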
Further, in step two, establishing the enterprise label system comprises the following steps:
s1: constructing feature attributes from the maximal attribute set obtained through data aggregation;
s2: feeding the relevant training set into a classifier for training;
s3: displaying the resulting labels.
Furthermore, while the enterprise label system is being established, classification labels can also be computed from the feature attributes by unsupervised learning; these labels can be used for fast retrieval and as a supplement to the label display.
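The source does not name the unsupervised algorithm. As one possible illustration, the sketch below uses k-means clustering to derive classification labels from feature attributes; the choice of k-means, the feature pair (registered capital in millions, employee count) and the sample values are all assumptions made for the example.

```python
from math import dist


def kmeans(points, k, iterations=10):
    """Plain k-means; the first k points seed the centroids so the run is deterministic."""
    centroids = [points[i] for i in range(k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return assignments, centroids


# Hypothetical feature attributes: (registered capital in millions, employees).
features = [(1.0, 5), (1.2, 8), (50.0, 300), (55.0, 320)]
assignments, centroids = kmeans(features, k=2)

# Cluster ids become searchable classification labels.
cluster_labels = [f"cluster-{a}" for a in assignments]
```

In practice the features would need scaling before clustering, since unscaled attributes with large ranges dominate the distance.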
Further, in step s2, the training is performed as follows:
a1: importing the training set and data;
training personnel define the configuration according to data of different topics, rules and sources, specifying the training data set, the training scope and the training data volume, which completes the initial data configuration for label training;
a2: label training and label comparison;
labels are generated by the algorithm and model and compared with manually assigned labels; labels are added or removed by manual intervention during training, so that features are recognised and the results improve continuously through machine learning. In addition, deterministic rules are formulated to generate the corresponding labels;
a3: cyclic training;
cyclic training on multi-topic, multi-dimensional and multi-source data is carried out using the method of a2, and the rules and corresponding algorithm parameters may be adjusted;
a4: effect inspection;
the actual effect of label generation is checked by introducing new data.
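Steps a1 to a4 can be sketched as a small loop. The source does not specify the classifier, the rules or the features, so everything concrete here is a labeled assumption: the one-feature threshold model, the `blacklisted` rule, the field names and the sample records are hypothetical and serve only to show the flow of configure, train and compare, iterate over parameters, then check on new data.

```python
def rule_label(record):
    """Deterministic rule (assumed): a blacklisted enterprise is always high risk."""
    return "high_risk" if record["blacklisted"] else None


def model_label(record, threshold):
    """Stand-in for the trained model: a single adjustable parameter."""
    return "high_risk" if record["overdue_count"] >= threshold else "normal"


def predict(record, threshold):
    # The deterministic rule takes precedence; otherwise the model decides.
    return rule_label(record) or model_label(record, threshold)


def agreement(records, threshold):
    """Share of records whose generated label matches the manual label (step a2)."""
    hits = sum(predict(r, threshold) == r["manual_label"] for r in records)
    return hits / len(records)


# a1: training set with manually assigned labels.
training_set = [
    {"overdue_count": 0, "blacklisted": False, "manual_label": "normal"},
    {"overdue_count": 1, "blacklisted": False, "manual_label": "normal"},
    {"overdue_count": 3, "blacklisted": False, "manual_label": "high_risk"},
    {"overdue_count": 0, "blacklisted": True, "manual_label": "high_risk"},
]

# a2 + a3: repeat the comparison while adjusting the algorithm parameter.
best_threshold = max(range(1, 6), key=lambda t: agreement(training_set, t))

# a4: check the effect on data not seen during training.
new_data = [
    {"overdue_count": 4, "blacklisted": False, "manual_label": "high_risk"},
    {"overdue_count": 0, "blacklisted": False, "manual_label": "normal"},
]
holdout_agreement = agreement(new_data, best_threshold)
```

Keeping the deterministic rule separate from the model mirrors the text: rules generate labels directly, while the model's parameters are what the cyclic training adjusts.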
Further, in step three, the enterprise industry chain analysis platform is built on knowledge graph technology: the analysis proceeds from the association relationships within the industry chain in which an entity is located, evolving subgraph structures are discovered by frequent subgraph discovery and mapped to the corresponding analysis dimensions, and the analysis dimensions are thereby extended, so that the dynamic evolution of the industry chain can be identified and tracked from a new perspective.
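As a heavily simplified illustration of frequent subgraph discovery, the sketch below counts single-edge patterns across time-ordered snapshots of an industry chain graph and reports which patterns are stable and which are newly emerging. Real frequent subgraph mining (for example gSpan) handles multi-edge structures and isomorphism; the snapshot data, node types and relation names here are hypothetical.

```python
from collections import Counter


def frequent_patterns(snapshots, min_support):
    """Return the edge patterns that occur in at least `min_support` snapshots."""
    support = Counter()
    for edges in snapshots:
        support.update(set(edges))   # count each pattern once per snapshot
    return {pattern for pattern, n in support.items() if n >= min_support}


# Each snapshot lists typed edges: (source type, relation, target type).
snapshots = [
    [("raw_material", "supplies", "manufacturer"),
     ("manufacturer", "supplies", "retailer")],
    [("raw_material", "supplies", "manufacturer"),
     ("manufacturer", "supplies", "retailer"),
     ("manufacturer", "invests_in", "logistics")],
    [("raw_material", "supplies", "manufacturer"),
     ("manufacturer", "invests_in", "logistics")],
]

# Patterns present in every snapshot form the stable backbone of the chain.
stable = frequent_patterns(snapshots, min_support=3)

# Patterns frequent in the later window but not in the earlier one are emerging.
emerging = frequent_patterns(snapshots[1:], 2) - frequent_patterns(snapshots[:2], 2)
```

Each emerging or disappearing pattern could then be mapped to an analysis dimension, such as related party investment, to track how the chain evolves.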
In summary, the invention supports multi-level penetration analysis of the industry chain. Based on two theories (U-shaped development / share right principle), it mines the industrial and commercial data, judicial information and other information of industry chain enterprises level by level, and analyses the industry chain distribution and risk trend of an enterprise and its related parties along dimensions such as regional distribution trend, regional risk rate trend, related party property right distribution, related party classification distribution, related party transaction distribution, management distribution, related party litigation distribution, litigation classification, industry distribution trend and industry risk rate trend, thereby verifying real-time tracking of industry chain change trends in the region.
Drawings
Fig. 1 is a flowchart of a method for establishing a big data service platform for small and medium-sized micro enterprises according to the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings. In the description of the present invention, it is to be understood that the orientation or positional relationship indicated by the orientation words such as "upper" and "lower" and "top" and "bottom" and the like are generally based on the orientation or positional relationship shown in the drawings, and are only for convenience of description and simplicity of description, and in the case of not making a reverse description, these orientation words do not indicate and imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation, and therefore, should not be taken as limiting the scope of the present invention; the terms "inner and outer" refer to the inner and outer relative to the profile of the respective component itself.
As shown in Fig. 1, a method for establishing a big data service platform for small, medium-sized and micro enterprises comprises the following steps:
step one: collecting enterprise data and forming a thematic database through data aggregation;
step two: establishing a label system for the target enterprise according to the enterprise's data and the thematic database formed in step one;
step three: establishing an enterprise industry chain analysis platform.
Further, in step one, enterprise data are collected from government public credit information and market credit information, and multiple index types are supported, including global indexes, local indexes, high-dimensional indexes and full-text indexes.
Further, in step one, data aggregation extracts data from each information source and aggregates them into a target model using the Spark computing engine. The engine provides interactive data processing by combining distributed memory with columnar storage. It introduces the concept of the RDD (resilient distributed dataset), and every statistical analysis task is composed of a number of basic operations on RDDs. RDDs are kept in memory so that subsequent tasks read the data directly from memory; the engine compiles an analysis task into a directed acyclic graph of RDDs and merges adjacent tasks according to the dependencies between the data.
Further, in step one, the Spark computing engine also supports distributed hybrid columnar storage across memory and flash media, and supports interactive analysis of massive data by loading the data into the distributed in-memory columnar storage of the analytical database.
Further, in step two, establishing the enterprise label system comprises the following steps:
s1: constructing feature attributes from the maximal attribute set obtained through data aggregation;
s2: feeding the relevant training set into a classifier for training;
s3: displaying the resulting labels.
Furthermore, while the enterprise label system is being established, classification labels can also be computed from the feature attributes by unsupervised learning; these labels can be used for fast retrieval and as a supplement to the label display.
Further, in step s2, the training is performed as follows:
a1: importing the training set and data;
training personnel define the configuration according to data of different topics, rules and sources, specifying the training data set, the training scope and the training data volume, which completes the initial data configuration for label training;
a2: label training and label comparison;
labels are generated by the algorithm and model and compared with manually assigned labels; labels are added or removed by manual intervention during training, so that features are recognised and the results improve continuously through machine learning. In addition, deterministic rules are formulated to generate the corresponding labels;
a3: cyclic training;
cyclic training on multi-topic, multi-dimensional and multi-source data is carried out using the method of a2, and the rules and corresponding algorithm parameters may be adjusted;
a4: effect inspection;
the actual effect of label generation is checked by introducing new data.
Further, in step three, the enterprise industry chain analysis platform is built on knowledge graph technology: the analysis proceeds from the association relationships within the industry chain in which an entity is located, evolving subgraph structures are discovered by frequent subgraph discovery and mapped to the corresponding analysis dimensions, and the analysis dimensions are thereby extended, so that the dynamic evolution of the industry chain can be identified and tracked from a new perspective.
The advantage of the method is that it enables multi-level penetration analysis of the industry chain. Starting from the industrial and commercial data, judicial information and other information of industry chain enterprises, and based on two theories (U-shaped development / share right principle), it mines this information level by level and analyses the industry chain distribution and risk trend of an enterprise and its related parties along dimensions such as regional risk rate trend, related party property right distribution, related party classification distribution, related party transaction distribution, management distribution, related party litigation distribution and litigation classification, thereby verifying real-time tracking of industry chain change trends in the region.
Finally, it should be noted that: various modifications and alterations of this invention may be made by those skilled in the art without departing from the spirit and scope of this invention. Thus, it is intended that the present invention cover the modifications and variations of this invention provided they come within the scope of the appended claims and their equivalents.
Claims (8)
1. A method for establishing a big data service platform for small, medium-sized and micro enterprises, characterized by comprising the following steps:
step one: collecting enterprise data and forming a thematic database through data aggregation;
step two: establishing a label system for the target enterprise according to the enterprise's data and the thematic database formed in step one;
step three: establishing an enterprise industry chain analysis platform.
2. The method for establishing a big data service platform for small, medium-sized and micro enterprises as claimed in claim 1, wherein in step one, enterprise data are collected from government public credit information and market credit information, and multiple index types are supported, including global indexes, local indexes, high-dimensional indexes and full-text indexes.
3. The method for establishing a big data service platform for small, medium-sized and micro enterprises as claimed in claim 2, wherein in step one, data aggregation extracts data from each information source and aggregates them into a target model using the Spark computing engine; the Spark computing engine provides interactive data processing by combining distributed memory with columnar storage; the Spark computing engine introduces the concept of the RDD, and every statistical analysis task is composed of a number of basic operations on RDDs; RDDs are kept in memory and subsequent tasks read the data directly from memory; the Spark computing engine compiles an analysis task into a directed acyclic graph of RDDs and merges adjacent tasks according to the dependencies between the data.
4. The method for establishing a big data service platform for small, medium-sized and micro enterprises as claimed in claim 1, wherein in step one, the Spark computing engine also supports distributed hybrid columnar storage across memory and flash media, and supports interactive analysis of massive data by loading the data into the distributed in-memory columnar storage of the analytical database.
5. The method for establishing a big data service platform for small, medium-sized and micro enterprises as claimed in claim 1, wherein in step two, establishing the enterprise label system comprises the following steps:
s1: constructing feature attributes from the maximal attribute set obtained through data aggregation;
s2: feeding the relevant training set into a classifier for training;
s3: displaying the resulting labels.
6. The method for establishing a big data service platform for small, medium-sized and micro enterprises as claimed in claim 1, wherein, while the enterprise label system is being established, classification labels can also be computed from the feature attributes by unsupervised learning, and these labels can be used for fast retrieval and as a supplement to the label display.
7. The method for establishing a big data service platform for small, medium-sized and micro enterprises as claimed in claim 1, wherein in step s2, the training is performed as follows:
a1: importing the training set and data;
training personnel define the configuration according to data of different topics, rules and sources, specifying the training data set, the training scope and the training data volume, which completes the initial data configuration for label training;
a2: label training and label comparison;
labels are generated by the algorithm and model and compared with manually assigned labels; labels are added or removed by manual intervention during training, so that features are recognised and the results improve continuously through machine learning; in addition, deterministic rules are formulated to generate the corresponding labels;
a3: cyclic training;
cyclic training on multi-topic, multi-dimensional and multi-source data is carried out using the method of a2, and the rules and corresponding algorithm parameters may be adjusted;
a4: effect inspection;
the actual effect of label generation is checked by introducing new data.
8. The method for establishing a big data service platform for small, medium-sized and micro enterprises as claimed in claim 1, wherein in step three, the enterprise industry chain analysis platform is built on knowledge graph technology: the analysis proceeds from the association relationships within the industry chain in which an entity is located, evolving subgraph structures are discovered by frequent subgraph discovery and mapped to the corresponding analysis dimensions, and the analysis dimensions are thereby extended, so that the dynamic evolution of the industry chain is identified and tracked from a new perspective.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110210402.4A CN112837199A (en) | 2021-02-25 | 2021-02-25 | Method for establishing big data service platform of small and medium-sized micro-enterprises |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110210402.4A CN112837199A (en) | 2021-02-25 | 2021-02-25 | Method for establishing big data service platform of small and medium-sized micro-enterprises |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112837199A (en) | 2021-05-25 |
Family
ID=75933347
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110210402.4A Pending CN112837199A (en) | 2021-02-25 | 2021-02-25 | Method for establishing big data service platform of small and medium-sized micro-enterprises |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112837199A (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109272155A (en) * | 2018-09-11 | 2019-01-25 | 郑州向心力通信技术股份有限公司 | A kind of corporate behavior analysis system based on big data |
CN109739820A (en) * | 2018-12-29 | 2019-05-10 | 科技谷(厦门)信息技术有限公司 | A kind of E-government information service system based on big data analysis |
CN109993644A (en) * | 2017-12-29 | 2019-07-09 | 航天信息股份有限公司 | A kind of portrait determines method, apparatus, electronic equipment and storage medium |
CN110489560A (en) * | 2019-06-19 | 2019-11-22 | 民生科技有限责任公司 | The little Wei enterprise portrait generation method and device of knowledge based graphical spectrum technology |
CN110796470A (en) * | 2019-08-13 | 2020-02-14 | 广州中国科学院软件应用技术研究所 | Market subject supervision and service oriented data analysis system |
CN112115277A (en) * | 2020-09-28 | 2020-12-22 | 中国建设银行股份有限公司 | Knowledge graph-based integrated circuit industrial chain identification method and system |
CN112131275A (en) * | 2020-09-23 | 2020-12-25 | 中国科学技术大学智慧城市研究院(芜湖) | Enterprise portrait construction method of holographic city big data model and knowledge graph |
CN112182246A (en) * | 2020-09-28 | 2021-01-05 | 上海市浦东新区行政服务中心(上海市浦东新区市民中心) | Method, system, medium, and application for creating an enterprise representation through big data analysis |
Non-Patent Citations (1)
Title |
---|
田娟;朱定局;杨文翰;: "基于大数据平台的企业画像研究综述", 计算机科学, no. 2, 15 November 2018 (2018-11-15), pages 68 - 72 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||