CN112837199A - Method for establishing big data service platform of small and medium-sized micro-enterprises - Google Patents

Method for establishing big data service platform of small and medium-sized micro-enterprises

Info

Publication number
CN112837199A
CN112837199A CN202110210402.4A CN202110210402A CN112837199A CN 112837199 A CN112837199 A CN 112837199A CN 202110210402 A CN202110210402 A CN 202110210402A CN 112837199 A CN112837199 A CN 112837199A
Authority
CN
China
Prior art keywords
data
training
establishing
enterprise
medium
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110210402.4A
Other languages
Chinese (zh)
Inventor
杨春林
张磊
廖敏
贺本静
黎巡巡
姜亚兰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Shulian Mingxin Technology Co ltd
Original Assignee
Chongqing Shulian Mingxin Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-02-25
Filing date
2021-02-25
Publication date
2021-05-25
Application filed by Chongqing Shulian Mingxin Technology Co ltd
Priority to CN202110210402.4A
Publication of CN112837199A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/088 Non-supervised learning, e.g. competitive learning

Abstract

The invention relates to a method for establishing a big data service platform for small and medium-sized micro enterprises, which comprises the following steps: step one: collecting enterprise data and forming a thematic database through data aggregation; step two: establishing a label system for the target enterprise according to the enterprise's data and the thematic database formed in step one; step three: establishing an enterprise industry chain analysis platform.

Description

Method for establishing big data service platform of small and medium-sized micro-enterprises
Technical Field
The invention relates to the technical field of enterprise service platforms, in particular to a method for establishing a big data service platform of small and medium-sized micro enterprises.
Background
At present, government departments do not use big data or similar means to gather scattered public credit information about enterprises. Lacking this data support, they cannot fully grasp the registration, operation, exit and credit risk changes of small and medium-sized enterprises in key regional industry chains, or the scope of financial activity and the financing conditions of enterprises in the region. A cloud platform is therefore needed to collect the public credit information of enterprises in the region, together with the registration, credit granting, default and other records of financing enterprises, so that changes in the distribution of industry chain related parties, in the areas of business activity and in the financial needs of all enterprises in the region can be tracked in real time. Based on such credit big data, the government and relevant departments obtain a decision basis for formulating industry support policies and credit support policies for small and medium-sized enterprises.
Disclosure of Invention
In view of the shortcomings of the prior art, the technical problem addressed by this patent application is how to provide a method for establishing a big data service platform for small and medium-sized micro enterprises, so as to achieve an intuitive, convenient and efficient enterprise service platform.
In order to solve the above technical problem, the invention adopts the following technical solution:
A method for establishing a big data service platform for small and medium-sized micro enterprises comprises the following steps:
step one: collecting enterprise data and forming a thematic database through data aggregation;
step two: establishing a label system for the target enterprise according to the enterprise's data and the thematic database formed in step one;
step three: establishing an enterprise industry chain analysis platform.
Further, in step one, enterprise data are collected from government public credit information and market credit information, and multiple index types are supported, including global, local, high-dimensional and full-text indexes.
Further, in step one, data aggregation is based on the Spark computing engine: data are extracted from each information source and aggregated into a target model. The Spark computing engine provides interactive data processing by combining distributed memory with columnar storage. It introduces the concept of the RDD (resilient distributed dataset), and every statistical analysis task is composed of basic operations on RDDs. RDDs are held in memory, so subsequent tasks read the data directly from memory; the engine compiles an analysis task into a directed acyclic graph of RDDs and merges adjacent tasks according to the dependencies between the data.
Further, in step one, the Spark computing engine also supports distributed hybrid columnar storage across memory and flash media, and supports interactive analysis of massive data by loading the data into the distributed in-memory columnar storage of the analytical database.
Further, in step two, establishing the enterprise label system comprises the following steps:
s1: constructing feature attributes from the maximum attribute set obtained through data aggregation;
s2: feeding the relevant training set into a classifier for training;
s3: displaying the resulting labels.
Furthermore, during establishment of the enterprise label system, classification labels can be computed from the feature attributes by unsupervised learning; these labels can be used for quick retrieval and as a supplement to the label display.
Further, in step s2, the training is performed as follows:
a1: importing the training set and data;
trainers define the configuration according to data of different subjects, different rules and different sources, specifying the training data set, training scope and training data volume, thereby completing the initial data configuration for label training;
a2: label training and label comparison;
labels are generated according to the algorithm and model and compared with the manually assigned labels; labels may be added or removed by manual intervention during training, and machine learning is used to recognize features and continuously improve the results. In addition, deterministic rules are formulated to generate the corresponding labels;
a3: cyclic training;
training on multi-topic, multi-dimensional and multi-source data is repeated using the method of a2, and the rules and corresponding algorithm parameters may be adjusted;
a4: effect check;
the actual effect of label generation is checked by introducing new data.
Further, in step three, when the enterprise industry chain analysis platform is established, knowledge graph technology is used to analyze the association relationships of the industry chain in which an entity is located; evolving subgraph structures are discovered by frequent subgraph discovery and mapped to corresponding analysis dimensions, which expands the analysis dimensions and allows the dynamic evolution of the industry chain to be identified and tracked from a new perspective.
In summary, the invention supports multi-level penetration analysis of the industry chain. Based on two theories (U-shaped development and the equity principle), it starts from the industrial and commercial data, judicial information and other information of industry chain enterprises, mines this information level by level, and analyzes the industry chain distribution and risk trends of an enterprise and its related parties, including the regional distribution trend, regional risk rate trend, related-party equity distribution, related-party classification distribution, related-party transaction distribution, management distribution, related-party litigation distribution, litigation classification, industry distribution trend and industry risk rate trend, so that changes in the industry chain within a region can be verified and tracked in real time.
Drawings
Fig. 1 is a flowchart of a method for establishing a big data service platform for small and medium-sized micro enterprises according to the present invention.
Detailed Description
The present invention is described in further detail below with reference to the accompanying drawings. In the description of the present invention, it should be understood that orientation terms such as "upper", "lower", "top" and "bottom" generally refer to the orientation or positional relationship shown in the drawings and are used only for convenience and simplicity of description. Unless stated otherwise, these terms do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation, and therefore should not be taken as limiting the scope of the present invention; the terms "inner" and "outer" refer to the inside and outside relative to the profile of the respective component itself.
As shown in Fig. 1, a method for establishing a big data service platform for small and medium-sized micro enterprises includes the following steps:
step one: collecting enterprise data and forming a thematic database through data aggregation;
step two: establishing a label system for the target enterprise according to the enterprise's data and the thematic database formed in step one;
step three: establishing an enterprise industry chain analysis platform.
Further, in step one, enterprise data are collected from government public credit information and market credit information, and multiple index types are supported, including global, local, high-dimensional and full-text indexes.
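As a rough illustration of what a full-text index over collected credit records could look like, the following Python sketch builds a simple inverted index. The record contents, enterprise ids and function names are hypothetical and are not taken from the patent, which does not specify how its indexes are implemented.

```python
from collections import defaultdict

def build_inverted_index(records):
    """Map each lowercase token to the set of record ids containing it."""
    index = defaultdict(set)
    for record_id, text in records.items():
        for token in text.lower().split():
            index[token].add(record_id)
    return index

def search(index, query):
    """Return ids of records that contain every token of the query."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result

# Hypothetical credit records keyed by enterprise id.
records = {
    "E001": "logistics company late tax payment",
    "E002": "software company loan default",
    "E003": "logistics company loan approved",
}
index = build_inverted_index(records)
```

A query such as search(index, "logistics loan") intersects the posting sets of both tokens, so only records containing every query term are returned.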
Further, in step one, data aggregation is based on the Spark computing engine: data are extracted from each information source and aggregated into a target model. The Spark computing engine provides interactive data processing by combining distributed memory with columnar storage. It introduces the concept of the RDD (resilient distributed dataset), and every statistical analysis task is composed of basic operations on RDDs. RDDs are held in memory, so subsequent tasks read the data directly from memory; the engine compiles an analysis task into a directed acyclic graph of RDDs and merges adjacent tasks according to the dependencies between the data.
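The idea of recording operations lazily and merging adjacent ones can be sketched in plain Python. The class below is a toy stand-in written for this illustration; it is not Spark's API, and the LazyDataset name, its methods and the sample rows are all assumptions. It only shows how a chain of recorded map and filter steps can be executed in a single fused pass over in-memory data.

```python
class LazyDataset:
    """Toy stand-in for an RDD: in-memory data plus a chain of pending steps."""

    def __init__(self, data, steps=()):
        self._data = list(data)      # held in memory for later tasks
        self._steps = tuple(steps)   # recorded lineage, not yet executed

    def map(self, fn):
        return LazyDataset(self._data, self._steps + (("map", fn),))

    def filter(self, pred):
        return LazyDataset(self._data, self._steps + (("filter", pred),))

    def collect(self):
        """Run all recorded steps in one fused pass over the data."""
        out = []
        for item in self._data:
            keep = True
            for kind, fn in self._steps:
                if kind == "map":
                    item = fn(item)
                elif not fn(item):
                    keep = False
                    break
            if keep:
                out.append(item)
        return out

# Hypothetical (enterprise_id, source, amount) rows from two sources.
rows = [("E001", "gov", 120.0), ("E002", "market", -5.0), ("E001", "market", 30.0)]
valid = LazyDataset(rows).filter(lambda r: r[2] >= 0).map(lambda r: (r[0], r[2]))

# Aggregate the cleaned rows into a per-enterprise total (the "target model").
totals = {}
for enterprise_id, amount in valid.collect():
    totals[enterprise_id] = totals.get(enterprise_id, 0.0) + amount
```

Nothing is computed until collect() is called; the filter and map steps then run together per row, which loosely mirrors how adjacent tasks without a data shuffle between them can be combined.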
Further, in step one, the Spark computing engine also supports distributed hybrid columnar storage across memory and flash media, and supports interactive analysis of massive data by loading the data into the distributed in-memory columnar storage of the analytical database.
Further, in step two, establishing the enterprise label system comprises the following steps:
s1: constructing feature attributes from the maximum attribute set obtained through data aggregation;
s2: feeding the relevant training set into a classifier for training;
s3: displaying the resulting labels.
Furthermore, during establishment of the enterprise label system, classification labels can be computed from the feature attributes by unsupervised learning; these labels can be used for quick retrieval and as a supplement to the label display.
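The patent does not name a specific unsupervised algorithm. As one possible illustration, the sketch below uses plain k-means on a single numeric feature to derive classification labels; the feature (annual revenue), the sample values, the initial centers and the label names are all hypothetical.

```python
def kmeans_1d(values, centers, iterations=10):
    """Plain k-means on scalar features; returns (centers, assignments)."""
    centers = list(centers)
    assignments = [0] * len(values)
    for _ in range(iterations):
        # Assign each value to its nearest center.
        assignments = [
            min(range(len(centers)), key=lambda k: abs(v - centers[k]))
            for v in values
        ]
        # Move each center to the mean of its members (keep it if empty).
        for k in range(len(centers)):
            members = [v for v, a in zip(values, assignments) if a == k]
            if members:
                centers[k] = sum(members) / len(members)
    return centers, assignments

# Hypothetical feature: annual revenue in millions for six enterprises.
revenue = [1.0, 1.5, 2.0, 40.0, 42.0, 45.0]
centers, clusters = kmeans_1d(revenue, centers=[0.0, 50.0])

# Cluster 0 started from the low center, cluster 1 from the high one.
label_names = {0: "small-revenue", 1: "large-revenue"}
labels = [label_names[c] for c in clusters]
```

The resulting labels need no manual annotation, which is why they can serve as a supplement to the labels produced by the trained classifier.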
Further, in step s2, the training is performed as follows:
a1: importing the training set and data;
trainers define the configuration according to data of different subjects, different rules and different sources, specifying the training data set, training scope and training data volume, thereby completing the initial data configuration for label training;
a2: label training and label comparison;
labels are generated according to the algorithm and model and compared with the manually assigned labels; labels may be added or removed by manual intervention during training, and machine learning is used to recognize features and continuously improve the results. In addition, deterministic rules are formulated to generate the corresponding labels;
a3: cyclic training;
training on multi-topic, multi-dimensional and multi-source data is repeated using the method of a2, and the rules and corresponding algorithm parameters may be adjusted;
a4: effect check;
the actual effect of label generation is checked by introducing new data.
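Steps a1 to a4 can be sketched as a small loop in Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the "model" is reduced to a single threshold rule, the feature (a count of overdue payments), the label names, the candidate parameters and all data values are hypothetical.

```python
def generate_labels(samples, threshold):
    """Deterministic rule: label an enterprise 'high-risk' above the threshold."""
    return ["high-risk" if s > threshold else "normal" for s in samples]

def agreement(predicted, manual):
    """Fraction of generated labels that match the manual labels."""
    matches = sum(p == m for p, m in zip(predicted, manual))
    return matches / len(manual)

def train_threshold(samples, manual, candidates):
    """a2/a3: try each candidate parameter and keep the best-agreeing one."""
    best_threshold, best_score = None, -1.0
    for threshold in candidates:
        score = agreement(generate_labels(samples, threshold), manual)
        if score > best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score

# a1: hypothetical training set (overdue-payment counts) with manual labels.
train_x = [0, 1, 2, 6, 7, 9]
train_y = ["normal", "normal", "normal", "high-risk", "high-risk", "high-risk"]
threshold, train_score = train_threshold(train_x, train_y, candidates=[1, 3, 5, 8])

# a4: check the effect on new, unseen data.
new_x = [0, 4, 8]
new_y = ["normal", "normal", "high-risk"]
holdout_score = agreement(generate_labels(new_x, threshold), new_y)
```

In this toy run the chosen threshold agrees fully with the manual labels on the training set but not on the new data, which is the kind of gap the effect check in a4 is meant to expose before the rules or parameters are adjusted again.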
Further, in step three, when the enterprise industry chain analysis platform is established, knowledge graph technology is used to analyze the association relationships of the industry chain in which an entity is located; evolving subgraph structures are discovered by frequent subgraph discovery and mapped to corresponding analysis dimensions, which expands the analysis dimensions and allows the dynamic evolution of the industry chain to be identified and tracked from a new perspective.
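Full frequent subgraph mining is considerably more involved than can be shown here. The Python sketch below covers only the simplest case, counting single-edge patterns that recur across snapshots of a graph; the node types, relation names, snapshot data and support threshold are hypothetical and are not taken from the patent.

```python
from collections import Counter

def edge_patterns(graph):
    """Set of (source type, relation, target type) patterns present in a graph."""
    node_types, edges = graph
    return {(node_types[s], rel, node_types[t]) for s, rel, t in edges}

def frequent_patterns(graphs, min_support):
    """Patterns that appear in at least min_support of the given graphs."""
    support = Counter()
    for graph in graphs:
        support.update(edge_patterns(graph))  # count each pattern once per graph
    return {p: n for p, n in support.items() if n >= min_support}

# Hypothetical quarterly snapshots: (node types, labelled edges).
types = {"A": "supplier", "B": "manufacturer", "C": "retailer", "D": "investor"}
snapshots = [
    (types, [("A", "supplies", "B"), ("B", "sells_to", "C")]),
    (types, [("A", "supplies", "B"), ("D", "holds_shares", "B")]),
    (types, [("A", "supplies", "B"), ("D", "holds_shares", "B"), ("B", "sells_to", "C")]),
]
frequent = frequent_patterns(snapshots, min_support=3)
```

Patterns that stay above the support threshold across snapshots describe the stable part of the chain, while patterns that appear or disappear over time hint at how it is evolving.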
The advantage of the method is that multi-level penetration analysis of the industry chain can be realized. Based on two theories (U-shaped development and the equity principle), the method starts from the industrial and commercial data, judicial information and other information of industry chain enterprises, mines this information level by level, and analyzes the industry chain distribution and risk trends of an enterprise and its related parties, including the regional risk rate trend, related-party equity distribution, related-party classification distribution, related-party transaction distribution, management distribution, related-party litigation distribution and litigation classification, so that changes in the industry chain within a region can be verified and tracked in real time.
Finally, it should be noted that: various modifications and alterations of this invention may be made by those skilled in the art without departing from the spirit and scope of this invention. Thus, it is intended that the present invention cover the modifications and variations of this invention provided they come within the scope of the appended claims and their equivalents.

Claims (8)

1. A method for establishing a big data service platform for small and medium-sized micro enterprises, characterized by comprising the following steps:
step one: collecting enterprise data and forming a thematic database through data aggregation;
step two: establishing a label system for the target enterprise according to the enterprise's data and the thematic database formed in step one;
step three: establishing an enterprise industry chain analysis platform.
2. The method for establishing a big data service platform for small and medium-sized micro enterprises as claimed in claim 1, wherein in step one, enterprise data are collected from government public credit information and market credit information, and multiple index types are supported, including global, local, high-dimensional and full-text indexes.
3. The method for establishing a big data service platform for small and medium-sized micro enterprises as claimed in claim 2, wherein in step one, data aggregation is based on the Spark computing engine, data being extracted from each information source and aggregated into a target model; the Spark computing engine provides interactive data processing by combining distributed memory with columnar storage and introduces the concept of the RDD, every statistical analysis task being composed of basic operations on RDDs; RDDs are held in memory, subsequent tasks read the data directly from memory, and the Spark computing engine compiles an analysis task into a directed acyclic graph of RDDs and merges adjacent tasks according to the dependencies between the data.
4. The method for establishing a big data service platform for small and medium-sized micro enterprises as claimed in claim 1, wherein in step one, the Spark computing engine also supports distributed hybrid columnar storage across memory and flash media, and supports interactive analysis of massive data by loading the data into the distributed in-memory columnar storage of the analytical database.
5. The method for establishing a big data service platform for small and medium-sized micro enterprises as claimed in claim 1, wherein in step two, establishing the enterprise label system comprises the following steps:
s1: constructing feature attributes from the maximum attribute set obtained through data aggregation;
s2: feeding the relevant training set into a classifier for training;
s3: displaying the resulting labels.
6. The method for establishing a big data service platform for small and medium-sized micro enterprises as claimed in claim 1, wherein, during establishment of the enterprise label system, classification labels can be computed from the feature attributes by unsupervised learning, and these labels can be used for quick retrieval and as a supplement to the label display.
7. The method for establishing a big data service platform for small and medium-sized micro enterprises as claimed in claim 1, wherein in step s2, the training is performed as follows:
a1: importing the training set and data;
trainers define the configuration according to data of different subjects, different rules and different sources, specifying the training data set, training scope and training data volume, thereby completing the initial data configuration for label training;
a2: label training and label comparison;
labels are generated according to the algorithm and model and compared with the manually assigned labels; labels may be added or removed by manual intervention during training, and machine learning is used to recognize features and continuously improve the results; in addition, deterministic rules are formulated to generate the corresponding labels;
a3: cyclic training;
training on multi-topic, multi-dimensional and multi-source data is repeated using the method of a2, and the rules and corresponding algorithm parameters may be adjusted;
a4: effect check;
the actual effect of label generation is checked by introducing new data.
8. The method for establishing a big data service platform for small and medium-sized micro enterprises as claimed in claim 1, wherein in step three, when the enterprise industry chain analysis platform is established, knowledge graph technology is used to analyze the association relationships of the industry chain in which an entity is located; evolving subgraph structures are discovered by frequent subgraph discovery and mapped to corresponding analysis dimensions, which expands the analysis dimensions and allows the dynamic evolution of the industry chain to be identified and tracked from a new perspective.
CN202110210402.4A 2021-02-25 2021-02-25 Method for establishing big data service platform of small and medium-sized micro-enterprises Pending CN112837199A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110210402.4A CN112837199A (en) 2021-02-25 2021-02-25 Method for establishing big data service platform of small and medium-sized micro-enterprises

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110210402.4A CN112837199A (en) 2021-02-25 2021-02-25 Method for establishing big data service platform of small and medium-sized micro-enterprises

Publications (1)

Publication Number Publication Date
CN112837199A (en) 2021-05-25

Family

ID=75933347

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110210402.4A Pending CN112837199A (en) 2021-02-25 2021-02-25 Method for establishing big data service platform of small and medium-sized micro-enterprises

Country Status (1)

Country Link
CN (1) CN112837199A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109272155A (en) * 2018-09-11 2019-01-25 郑州向心力通信技术股份有限公司 Corporate behavior analysis system based on big data
CN109739820A (en) * 2018-12-29 2019-05-10 科技谷(厦门)信息技术有限公司 E-government information service system based on big data analysis
CN109993644A (en) * 2017-12-29 2019-07-09 航天信息股份有限公司 Portrait determination method, apparatus, electronic device and storage medium
CN110489560A (en) * 2019-06-19 2019-11-22 民生科技有限责任公司 Method and device for generating small and micro enterprise portraits based on knowledge graph technology
CN110796470A (en) * 2019-08-13 2020-02-14 广州中国科学院软件应用技术研究所 Market subject supervision and service oriented data analysis system
CN112115277A (en) * 2020-09-28 2020-12-22 中国建设银行股份有限公司 Knowledge graph-based integrated circuit industrial chain identification method and system
CN112131275A (en) * 2020-09-23 2020-12-25 中国科学技术大学智慧城市研究院(芜湖) Enterprise portrait construction method of holographic city big data model and knowledge graph
CN112182246A (en) * 2020-09-28 2021-01-05 上海市浦东新区行政服务中心(上海市浦东新区市民中心) Method, system, medium, and application for creating an enterprise representation through big data analysis

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
田娟; 朱定局; 杨文翰: "基于大数据平台的企业画像研究综述" [Survey of enterprise portrait research based on big data platforms], 计算机科学 [Computer Science], no. 2, 15 November 2018 (2018-11-15), pages 68-72 *

Similar Documents

Publication Publication Date Title
Tiwari et al. Big data analytics in supply chain management between 2010 and 2016: Insights to industries
US20190340518A1 (en) Systems and methods for enriching modeling tools and infrastructure with semantics
US20210271809A1 (en) Machine learning process implementation method and apparatus, device, and storage medium
Sartori et al. Bankruptcy forecasting using case-based reasoning: The CRePERIE approach
Chang et al. Identification of the technology life cycle of telematics: A patent-based analytical perspective
CN112182246B (en) Method, system, medium, and application for creating an enterprise representation through big data analysis
Clinchant et al. Comparing machine learning approaches for table recognition in historical register books
CN110109908B (en) Analysis system and method for mining potential relationship of person based on social basic information
CN113626607B (en) Abnormal work order identification method and device, electronic equipment and readable storage medium
Jahani et al. Data science and big data analytics: A systematic review of methodologies used in the supply chain and logistics research
Tinelli et al. Embedding semantics in human resources management automation via SQL
Chen et al. Exploring technology opportunities and evolution of IoT-related logistics services with text mining
CN110033191B (en) Business artificial intelligence analysis method and system
Ren et al. An effective similarity determination model for case-based reasoning in support of low-carbon product design
Pawar et al. Big data analytics in logistics and supply chain management: a review of literature
Ferranti et al. A framework for evaluating ontology meta-matching approaches
Huerta et al. Data mining: Application of digital marketing in education
Bella et al. Semi-supervised approach for recovering traceability links in complex systems
CN112363996A (en) Method, system, and medium for building a physical model of a power grid knowledge graph
CN112699245A (en) Construction method and device and application method and device of budget management knowledge graph
CN112837199A (en) Method for establishing big data service platform of small and medium-sized micro-enterprises
Fan et al. Spatially enabled customer segmentation using a data classification method with uncertain predicates
Paul et al. Big Data Analytics for Marketing Intelligence
CN112102006A (en) Target customer acquisition method, target customer search method and target customer search device based on big data analysis
AU2021103329A4 (en) The investigation technique of object using machine learning and system.

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination