CN116645153A - Commercial environment evaluation system based on multi-architecture NLP pre-training model and blockchain - Google Patents

Publication number
CN116645153A
Authority
CN
China
Prior art keywords
data
training model
nlp
training
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310618083.XA
Other languages
Chinese (zh)
Inventor
时聪聪
闫晶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zero Data Technology Co ltd
Beijing Zero Vision Network Technology Co ltd
Original Assignee
Beijing Zero Data Technology Co ltd
Beijing Zero Vision Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zero Data Technology Co ltd, Beijing Zero Vision Network Technology Co ltd
Priority to CN202310618083.XA
Publication of CN116645153A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0282 Rating or review of business operators or products
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/64 Protecting data integrity, e.g. using checksums, certificates or signatures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Abstract

A commercial environment evaluation system based on a multi-architecture NLP pre-training model and a blockchain, relating to the technical field of artificial intelligence. The system comprises: an enterprise big data middle platform module for acquiring authenticated training sample data; an NLP pre-training model preprocessing module for performing data acquisition and preprocessing based on the trained NLP pre-training model and the user's investigation requirements for a target commercial environment, and generating an investigation design scheme for the target commercial environment; a heterogeneous NLP pre-training model checking module for checking the investigation design scheme based on the trained heterogeneous NLP pre-training model to obtain a final investigation design scheme; and a data evaluation informatization management module for quantitatively evaluating the reliability of the final investigation design scheme and the investigation data. Implementing the technical scheme provided by the application can improve the accuracy of commercial environment evaluation.

Description

Commercial environment evaluation system based on multi-architecture NLP pre-training model and blockchain
Technical Field
The application relates to the technical field of artificial intelligence, in particular to a commercial environment evaluation system based on a multi-architecture NLP pre-training model and a blockchain.
Background
Commercial environment evaluation refers to the process of scientifically evaluating and monitoring the commercial environment of a city, region or enterprise. It generally combines quantitative and qualitative methods and includes investigation and analysis work such as questionnaire surveys, expert interviews, data analysis and user demand analysis. The subsequent data processing must then extract useful information from massive volumes of questionnaires, voice recordings, images and data forms to support the consulting scheme, the system and product design, and the overall solution at later stages.
Traditional data processing relies mainly on manual work, with information technology in a supporting role. The intelligence of such computer-assisted processing is limited: it focuses mainly on speech-to-text conversion, data query and retrieval and the like, while judging the content of the collected data and the conclusions drawn from it still depends mainly on manual processing. Manual processing often brings low efficiency, difficulty in unifying data, and the risk of human error. In addition, because judgment rules and personnel experience differ from one person to another, the standards for data credibility are often inconsistent, which lowers the accuracy of an enterprise's commercial environment evaluation.
Disclosure of Invention
The application provides a commercial environment evaluation system based on a multi-architecture NLP pre-training model and a blockchain, which can improve the accuracy of commercial environment evaluation of enterprises.
In a first aspect, the present application provides a system for evaluating a commercial environment based on a multi-architecture NLP pre-training model and a blockchain, the system comprising:
the enterprise big data middle platform module is used for acquiring the authenticated training sample data;
the NLP pre-training model preprocessing module is used for training an NLP pre-training model according to the training sample data, and carrying out data acquisition and preprocessing on the basis of the trained NLP pre-training model and the investigation requirements of a user on a target commercial environment to generate an investigation design scheme of the target commercial environment;
the heterogeneous NLP pre-training model checking module is used for training a heterogeneous NLP pre-training model according to the training sample data, checking the investigation design scheme based on the trained heterogeneous NLP pre-training model, and obtaining a final investigation design scheme;
the data evaluation informatization management module is used for acquiring investigation data acquired according to the final investigation design scheme and acquiring an evaluation result of carrying out reliability quantitative evaluation on the final investigation design scheme and the investigation data.
With this technical scheme, data acquisition and preprocessing are carried out based on the NLP pre-training model and the user's investigation requirements for the target commercial environment, generating an investigation design scheme for the target commercial environment. The investigation design scheme is then checked based on the heterogeneous NLP pre-training model to obtain the final investigation design scheme. Having two heterogeneous NLP pre-training models check each other helps avoid the inherent defects of an NLP model with a single architecture, and evaluating the data through the data evaluation informatization management module can further improve the accuracy of commercial environment evaluation.
Optionally, the system further comprises an enterprise CA central node module, which is used for authenticating sample data to obtain the authenticated training sample data and sending the authenticated training sample data to the enterprise big data middle platform module.
By adopting the technical scheme, the authenticity and the integrity of the data can be ensured by authenticating the sample data through the enterprise CA central node module, and the credibility of the used training sample data is ensured.
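As a rough illustration of how authenticating sample data can protect its integrity, the sketch below tags each record with an HMAC and rejects records whose tag no longer verifies. This is a deliberate simplification: the application describes a CA central node issuing digital certificates, whereas the sketch substitutes a shared secret key. The names `CA_KEY`, `authenticate_sample` and `verify_sample` are hypothetical and do not come from the application.

```python
import hashlib
import hmac
import json

# Hypothetical shared secret standing in for the CA node's signing credentials.
CA_KEY = b"demo-key-not-for-production"


def authenticate_sample(sample: dict) -> dict:
    """Return the sample together with an integrity tag over its canonical JSON form."""
    payload = json.dumps(sample, sort_keys=True).encode("utf-8")
    tag = hmac.new(CA_KEY, payload, hashlib.sha256).hexdigest()
    return {"sample": sample, "tag": tag}


def verify_sample(record: dict) -> bool:
    """Recompute the tag and compare it with the stored one in constant time."""
    payload = json.dumps(record["sample"], sort_keys=True).encode("utf-8")
    expected = hmac.new(CA_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record["tag"])


# Illustrative records: one untouched, one whose text was edited after tagging.
record = authenticate_sample({"id": 1, "text": "historical survey response"})
tampered = {"sample": {"id": 1, "text": "edited response"}, "tag": record["tag"]}
```

Only records for which `verify_sample` returns true would be passed on as authenticated training sample data; a certificate-based signature would additionally let third parties verify the record without holding the secret.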
Optionally, the enterprise CA central node module and a distributed intelligent node module form a private chain, and the system further includes: the distributed intelligent node module, used for generating a new block of the private chain based on the evaluation result; and a blockchain public service module, used for providing public digital certificate authorization for the enterprise CA central node module, and further used for forming a public chain with the enterprise CA central node module, authenticating the new block of the private chain according to the consensus mechanism of the public chain, and uplinking the new block of the private chain to the public chain after the authentication passes.
With this technical scheme, a data authentication mechanism combining a private chain and a public chain is adopted: the private chain, formed by the enterprise CA central node module and the distributed intelligent node module, and the public chain, formed by the blockchain public service module together with the trusted enterprise CA central node module, each authenticate the new block of the private chain, so that the credibility of the data is ensured throughout the investigation process and the data is protected from tampering.
Optionally, the enterprise big data middle platform module is further configured to obtain a new NLP training sample set based on the evaluation result, and send the new NLP training sample set to the NLP pre-training model preprocessing module and the heterogeneous NLP pre-training model checking module, so that the NLP pre-training model and the heterogeneous NLP pre-training model perform secondary training based on the new NLP training sample set, and model upgrading is completed.
With this technical scheme, augmented positive and negative samples are generated based on the evaluation result, and the NLP pre-training model and the heterogeneous NLP pre-training model undergo secondary training on these samples, so that model performance is continuously optimized, the models are upgraded, and the accuracy of commercial environment evaluation can be further improved.
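One way to picture how a new NLP training sample set might be assembled from evaluation results is sketched below. The record fields (`text`, `credible`), the 1/0 labelling and the duplicate check are assumptions made for illustration; the application does not specify the data format or the training procedure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvaluatedItem:
    text: str       # survey item or response text (assumed field)
    credible: bool  # outcome of the reliability evaluation (assumed field)


def build_training_set(existing, evaluated):
    """Append newly evaluated items to existing (text, label) pairs, skipping duplicates."""
    seen = {text for text, _ in existing}
    merged = list(existing)
    for item in evaluated:
        if item.text not in seen:
            # Credible items become positive samples (1), the rest negative samples (0).
            merged.append((item.text, 1 if item.credible else 0))
            seen.add(item.text)
    return merged


# Illustrative data only.
existing = [("How long did licence approval take?", 1)]
evaluated = [
    EvaluatedItem("Rate the tax filing process from 1 to 5.", True),
    EvaluatedItem("asdf qwerty", False),
    EvaluatedItem("How long did licence approval take?", True),
]
new_set = build_training_set(existing, evaluated)
```

The merged list would then be handed to both models for their secondary training round.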
Optionally, the distributed intelligent node module includes: the private chain new block generation unit is used for acquiring the evaluation result, determining the data content of the private chain new block based on the evaluation result, authenticating the data content of the private chain new block according to the consensus mechanism of the private chain, and linking to the tail of the private chain.
By adopting the technical scheme, the generation of the new block of the private chain is an important component part of the block chain, and the data of the new block is authenticated by a consensus mechanism of the private chain, so that the safety and the credibility of the data of the private chain are ensured.
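The idea of linking a new block to the tail of a chain can be sketched with a minimal hash-linked list. The block layout (`index`, `prev_hash`, `data`, `hash`) and the function names are assumptions chosen for illustration, not the format used by the application, and the consensus step is omitted here.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash the block body (everything except the stored hash itself)."""
    body = {key: block[key] for key in ("index", "prev_hash", "data")}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_block(chain: list, data: dict) -> dict:
    """Create a block that points at the current tail and append it to the chain."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    block = {"index": len(chain), "prev_hash": prev_hash, "data": data}
    block["hash"] = block_hash(block)
    chain.append(block)
    return block


def chain_is_valid(chain: list) -> bool:
    """Check every stored hash and every link to the preceding block."""
    for i, block in enumerate(chain):
        if block["hash"] != block_hash(block):
            return False
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False
    return True


# Illustrative chain; the credibility value is made up.
chain = []
append_block(chain, {"evaluation": "genesis"})
append_block(chain, {"evaluation": {"credibility": 0.87}})
```

Because each block stores the hash of its predecessor, altering the data of an earlier block makes `chain_is_valid` fail, which is the tamper-evidence property the scheme relies on.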
Optionally, the blockchain public service module includes: a public chain new block generating unit, used for packaging the data content of the new block of the private chain according to the public chain block data format based on the Qtum quantum chain to generate the data content of a new block of the public chain, authenticating the data content of the new block of the public chain according to the Qtum consensus mechanism, and linking the new block to the tail of the public chain.
With this technical scheme, the private chain data is uplinked to the public chain by packaging it into the public chain block data format, and the data content of the new public chain block is authenticated according to the Qtum consensus mechanism, which ensures the security and reliability of the public chain, improves the data transmission and sharing capability of the blockchain, and promotes the application and popularization of the blockchain.
Optionally, the NLP pre-training model preprocessing module is configured to train the NLP pre-training model according to the training sample data, perform data acquisition and preprocessing based on the trained NLP pre-training model and an investigation requirement of a user on a target commerce environment, and generate an investigation design scheme for the target commerce environment, and includes: the NLP pre-training model preprocessing module is specifically configured to train the NLP pre-training model according to the training sample data, respond to an investigation requirement input by a user for evaluating a target commercial environment, and perform data acquisition and preprocessing based on a ChatGPT preprocessing algorithm in the trained NLP pre-training model and the investigation requirement to generate an investigation design scheme for the target commercial environment.
By adopting the technical scheme, the data acquisition and the preprocessing are carried out based on the research requirement of evaluating the target commercial environment by the user and the ChatGPT preprocessing algorithm in the NLP pre-training model, and the workload of manually acquiring and processing the data is reduced by intelligent processing of the data, and meanwhile, the uniformity and the credibility of the data processing are improved, and the accuracy of evaluating the commercial environment of an enterprise is further improved.
Optionally, the heterogeneous NLP pre-training model checking module is configured to train the heterogeneous NLP pre-training model according to the training sample data, and check the investigation design scheme based on the trained heterogeneous NLP pre-training model, to obtain a final investigation design scheme, including: the heterogeneous NLP pre-training model checking module is specifically configured to train a heterogeneous NLP pre-training model according to the training sample data, check the investigation design scheme based on the BERT model in the trained heterogeneous NLP pre-training model to obtain a scheme evaluation coincidence value, take data with the scheme evaluation coincidence value being greater than a preset value as a positive sample set, and generate a final investigation design scheme based on the positive sample set.
By adopting the technical scheme, the investigation design scheme generated based on the NLP pre-training model is checked through the heterogeneous NLP pre-training model which is different from the NLP pre-training model in architecture, so that the scheme evaluation coincidence value is obtained, namely, the mutual check of the two models is realized, the data with the scheme evaluation coincidence value larger than the preset value is used as a positive sample set, the final investigation design scheme is generated based on the positive sample set, and the data accuracy of the scheme design stage is improved.
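The threshold step described above can be sketched as a simple filter. In the application the scheme evaluation coincidence value would come from the BERT-based checker; here the scores are hard-coded, and the function name, the example items and the threshold of 0.7 are all illustrative assumptions.

```python
def select_positive_samples(scored_items, threshold):
    """Keep the items whose coincidence value is strictly greater than the threshold."""
    return [item for item, score in scored_items if score > threshold]


# Made-up (item, coincidence value) pairs standing in for the checker's output.
scored = [
    ("Q1: days needed to register a company", 0.92),
    ("Q2: favourite colour of the respondent", 0.15),
    ("Q3: number of permits required", 0.71),
    ("Q4: satisfaction with tax services", 0.70),
]
positive_set = select_positive_samples(scored, threshold=0.7)
```

Note that an item scoring exactly at the preset value is excluded, matching the "greater than" wording; the final investigation design scheme would then be generated from `positive_set`.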
Optionally, the data evaluation informatization management module being configured to obtain investigation data collected according to the final investigation design scheme and to obtain an evaluation result of a quantitative reliability evaluation of the final investigation design scheme and the investigation data includes: the data evaluation informatization management module is specifically configured to obtain the investigation data collected according to the final investigation design scheme, obtain reliability verification results for the investigation data and the final investigation design scheme, and divide the reliability verification results into a second positive sample set and a second negative sample set according to a preset division rule; and to analyze the verification results according to a hierarchical analysis model in a preset index-based audit model to obtain a plurality of evaluation index results, and generate the data collection evaluation result of the quantitative credibility evaluation based on the second positive sample set and the plurality of evaluation index results.
By adopting the technical scheme, the collected investigation data and the final investigation design scheme are subjected to reliability verification, so that the data collection stage can be evaluated, the reliability quantitative evaluation data can be obtained, and the accuracy of the enterprise's commercial environment evaluation can be further improved.
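A minimal sketch of the two steps above, under stated assumptions: the preset division rule is taken to be a score threshold, and the combination of evaluation index results is taken to be a normalized weighted sum. The application names a hierarchical analysis model but does not give its details, so the index names, weights and functions below are hypothetical.

```python
def split_by_rule(results, min_score):
    """Divide verification results into positive and negative sets by a score threshold."""
    positive = [r for r in results if r["score"] >= min_score]
    negative = [r for r in results if r["score"] < min_score]
    return positive, negative


def weighted_credibility(index_results, weights):
    """Combine per-index results into one value using normalized weights."""
    total = sum(weights.values())
    return sum(index_results[name] * w for name, w in weights.items()) / total


# Made-up verification results and index values.
results = [
    {"id": "r1", "score": 0.9},
    {"id": "r2", "score": 0.4},
    {"id": "r3", "score": 0.6},
]
positive, negative = split_by_rule(results, min_score=0.6)

index_results = {"completeness": 1.0, "consistency": 0.5}
weights = {"completeness": 3, "consistency": 1}
overall = weighted_credibility(index_results, weights)
```

With these inputs the overall value is (3 × 1.0 + 1 × 0.5) / 4 = 0.875; a fuller hierarchical analysis would derive the weights from pairwise comparisons rather than fixing them by hand.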
In a second aspect of the application, an electronic device is provided, which comprises the above commercial environment evaluation system based on a multi-architecture NLP pre-training model and a blockchain.
In summary, one or more technical solutions provided in the embodiments of the present application at least have the following technical effects or advantages:
1. The application carries out data acquisition and preprocessing based on the NLP pre-training model and the user's investigation requirements for the target commercial environment to generate an investigation design scheme for the target commercial environment, and then checks the investigation design scheme based on the heterogeneous NLP pre-training model to obtain the final investigation design scheme. Having the two heterogeneous NLP pre-training models check each other avoids the inherent defects of an NLP model with a single architecture, and evaluating the data through the data evaluation informatization management module can further improve the accuracy of commercial environment evaluation;
2. The application generates augmented positive and negative samples based on the evaluation result and carries out secondary training of the NLP pre-training model and the heterogeneous NLP pre-training model on these samples, so as to continuously optimize model performance, upgrade the models and further improve the accuracy of commercial environment evaluation;
3. The application adopts a data authentication mechanism combining a private chain and a public chain: the private chain, formed by the enterprise CA central node module and the distributed intelligent node module, and the public chain, formed by the blockchain public service module together with the trusted enterprise CA central node module, each authenticate the new block of the private chain, thereby ensuring the credibility of the data throughout the investigation process and protecting the data from tampering.
Drawings
FIG. 1 is a schematic block diagram of a system for evaluating a commercial environment based on a multi-architecture NLP pre-training model and a blockchain according to an embodiment of the present application;
FIG. 2 is a block diagram of a preferred multi-architecture NLP pre-training model and blockchain based commercial environment assessment system according to an embodiment of the present application.
Reference numerals: 10, enterprise big data middle platform module; 20, NLP pre-training model preprocessing module; 30, heterogeneous NLP pre-training model checking module; 40, data evaluation informatization management module; 50, enterprise CA central node module; 60, distributed intelligent node module; 70, blockchain public service module.
Detailed Description
In order that those skilled in the art will better understand the technical solutions in the present specification, the technical solutions in the embodiments of the present specification will be clearly and completely described below with reference to the drawings in the embodiments of the present specification, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments.
In describing embodiments of the present application, words such as "such as" or "for example" are used to mean serving as an example, illustration, or description. Any embodiment or design described herein as "such as" or "for example" should not be construed as preferred or advantageous over other embodiments or designs. Rather, the use of words such as "such as" or "for example" is intended to present related concepts in a concrete fashion.
In the description of embodiments of the application, the term "plurality" means two or more. For example, a plurality of systems means two or more systems, and a plurality of screen terminals means two or more screen terminals. Furthermore, the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more such features. The terms "comprising," "including," "having," and variations thereof mean "including but not limited to," unless expressly specified otherwise.
In order to facilitate understanding of the method and system provided by the embodiments of the present application, a description of the background of the embodiments of the present application is provided before the description of the embodiments of the present application.
In research and evaluation services such as commercial environment evaluation, park services and enterprise consulting, the design of the research scheme and the evaluation of data results depend heavily on personnel experience and involve labor-intensive manual steps. Traditional investigation scheme design relies mainly on manual work. The intelligence of computer-assisted data processing is limited and concentrated on basic tasks such as speech-to-text conversion and data query and retrieval, so it cannot replace intellectual work such as manual scheme design at scale. When judging the credibility of data content and data conclusions, manual processing often brings low efficiency, difficulty in unifying data, and the risk of human error. In addition, because judgment rules and personnel experience differ from one person to another, the standards for data credibility are often inconsistent.
In view of the foregoing background, those skilled in the art will understand the problems existing in the prior art. The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings; it is apparent that the described embodiments are only some, but not all, embodiments of the present application.
With the continuous development of big data services, enterprises have accumulated massive multidimensional data over long-term business expansion. However, the application of this data mostly remains at the stage of data archiving, data query and the like, and its value has not been fully exploited through technical means such as knowledge distillation, knowledge graphs, knowledge datafication and data intelligence. As a result, the accumulated big data can hardly provide intelligent support for data processing in the research and analysis stage, and so cannot significantly reduce labor input or effectively solve the data acquisition problem.
To fully mine the value of the accumulated big data and make expert experience explicit, the application draws on recent advances in NLP models to convert steps that traditionally depend heavily on manual processing into computable, quantifiable problems. By combining a multi-architecture NLP pre-training model with blockchain technology, it establishes a new data acquisition and evaluation system in which data is processed and accepted as credible intelligently, with humans assisting and making key decisions, and thereby addresses the problems of data credibility, manual dependence and tamper resistance in the market research process.
Referring to fig. 1, a schematic block diagram of a commercial environment evaluation system based on a multi-architecture NLP pre-training model and a blockchain is provided in an embodiment of the present application.
The commercial environment evaluation system based on the multi-architecture NLP pre-training model and the blockchain comprises: the system comprises an enterprise big data middle platform module 10, an NLP pre-training model preprocessing module 20, a heterogeneous NLP pre-training model checking module 30 and a data evaluation informatization management module 40.
The enterprise big data middle platform module 10 is a unified platform that integrates databases and data services. It supports data access and the storage and processing of massive data, and, through data and algorithm services built on its algorithm modules, provides data services and algorithm support for business applications.
In the embodiment of the present application, the enterprise big data middle platform module 10 may be configured to obtain authenticated training sample data. Specifically, the sample data may be a historical market research data set, the massive parameters of an open large model, and an initial target commercial environment; the sample data is authenticated to obtain the authenticated training sample data, which is then sent to the NLP pre-training model preprocessing module 20 and the heterogeneous NLP pre-training model checking module 30 so that the corresponding models are trained on it.
The NLP pre-training model preprocessing module 20 is a natural language processing (NLP) tool driven by artificial intelligence technology. An NLP pre-training model is a deep learning model pre-trained on a large natural language corpus; having learned the structure and semantics of language, it can be fine-tuned and applied to various NLP tasks such as text classification, named entity recognition, sentiment analysis and question answering. In particular, it can provide professional semantic analysis for data acquisition and investigation analysis, and output the related reasoning and judgment results in a format agreed with the user, which may be text, language or charts.
In the embodiment of the present application, the NLP pre-training model preprocessing module 20 is configured to train the NLP pre-training model according to the training sample data after receiving the training sample data sent by the enterprise big data middle platform module 10, so as to obtain the trained NLP pre-training model. It then performs data acquisition and preprocessing based on the trained NLP pre-training model and the user's investigation requirements for the target commercial environment to generate an investigation design scheme for the target commercial environment.
Heterogeneous NLP pre-training model checking module 30 is a natural language processing tool with different technical architecture or technical route from NLP pre-training model preprocessing module 20. The method is mainly used for checking and correcting the natural language text, can automatically detect and correct errors and irregular use in the text, and improves the accuracy and efficiency of natural language processing.
In the embodiment of the application, the heterogeneous NLP pre-training model checking module 30 is used for training the heterogeneous NLP pre-training model according to training sample data, checking the research design scheme based on the trained heterogeneous NLP pre-training model to obtain a final research design scheme, realizing the mutual checking of the two models, and further improving the data accuracy of the scheme design stage.
The data evaluation informatization management module 40 may be an information management system oriented to data acquisition and processing services. It receives the data services and algorithm services of the enterprise big data middle platform module 10 and lets users access or obtain services from intelligent terminals. Supported by the enterprise big data middle platform module 10, it integrates the artificial intelligence services of the NLP pre-training model preprocessing module 20 and the heterogeneous NLP pre-training model checking module 30 to provide users with intelligent data acquisition and processing services. Users can use the module to create investigation and data acquisition requirements, call the NLP intelligent services, and carry out manual auditing and data release, so that the whole data acquisition and processing process is managed in one system.
In the embodiment of the present application, the data evaluation informatization management module 40 is configured to receive the final investigation design scheme, acquire investigation data collected according to the final investigation design scheme, and acquire an evaluation result of a quantitative reliability evaluation of the final investigation design scheme and the investigation data, so as to quantitatively evaluate the validity and reliability of the data processing result.
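The data flow among modules 20, 30 and 40 described in this embodiment can be sketched as a small pipeline of callables. The stub functions below merely stand in for the trained models and the evaluation logic; their names, signatures and behavior are assumptions for illustration and are not specified by the application.

```python
def run_pipeline(requirement, generate, check, collect, evaluate):
    """Wire the stages together: draft scheme, checked scheme, collected data, evaluation."""
    draft = generate(requirement)        # role of preprocessing module 20
    final = check(draft)                 # role of checking module 30
    data = collect(final)                # investigation data gathered per the final scheme
    return final, evaluate(final, data)  # role of management module 40


# Stand-in stubs; real modules would call the trained models.
def generate(requirement):
    return [f"{requirement}: question {i}" for i in range(1, 4)]


def check(draft):
    # Pretend the heterogeneous model rejects the third question.
    return [q for q in draft if not q.endswith("3")]


def collect(scheme):
    return {q: "answer" for q in scheme}


def evaluate(scheme, data):
    # Toy score: share of scheme questions that received an answer.
    return len(data) / len(scheme) if scheme else 0.0


final_scheme, score = run_pipeline("licensing", generate, check, collect, evaluate)
```

Passing the stages in as functions keeps the sketch independent of any particular model library; each stub could be swapped for a real model call without changing `run_pipeline`.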
Based on the above embodiments, as an alternative embodiment, the commercial environment evaluation system based on a multi-architecture NLP pre-training model and a blockchain may further include: enterprise CA central node module 50, distributed intelligent node module 60, and blockchain public service module 70.
The enterprise CA central node module 50 may be a unified digital certificate authority of an enterprise, and provides digital certificate authority and authentication for distributed nodes within the enterprise management and control boundary range, so as to protect information security and data integrity of the enterprise.
In the embodiment of the present application, the enterprise CA center node module 50 is configured to authenticate sample data to obtain authenticated training sample data, and send the authenticated training sample data to the enterprise big data middle station module 10 for training by a corresponding model, so as to ensure the credibility of the used training sample data.
The distributed intelligent node module 60 may be an authenticated intelligent terminal node on the user side, including physical devices and virtualized resources. Through the distributed intelligent node module 60, the user can access the data evaluation informatization management module 40 in a distributed manner to obtain integrated data and algorithm services. The distributed intelligent node module 60 also has an automatic speech recognition function and an NLP-based text analysis tool, and can perform speech-to-text conversion, text semantic analysis and basic data preprocessing. These functions can be obtained online by accessing the data evaluation informatization management module 40, or performed offline by lightweight edge-side tools.
In an embodiment of the present application, the distributed intelligent node module 60 is configured to generate a new block of the private chain based on the evaluation result. The distributed intelligent node module 60 and the enterprise CA center node module 50 together form the node set of the enterprise private blockchain. The distributed intelligent node module 60 may be a distributed node of the enterprise private chain; it stores the public and private key pair of the node, which is authorized and managed through a digital CA certificate. The CA certificate is issued and authenticated by the enterprise CA center node module 50.
Further, the distributed intelligent node module 60 includes a private chain new block generating unit. A new block of the private chain may be generated as follows. The terminal device corresponding to the distributed intelligent node module 60 obtains digital certificate authorization from the enterprise CA root node server, which completes the identity authentication of the terminal device. The private chain new block generating unit acquires the evaluation result sent by the data evaluation informatization management module 40, analyzes it according to a preset dividing rule to obtain a second positive sample set and a second negative sample set, and determines the content of the new private chain block from the second positive sample set. The unit then encrypts the new block content based on the public and private key pair stored by the enterprise private chain distributed node corresponding to the distributed intelligent node module 60, and broadcasts it to the other nodes in the enterprise management domain. The other distributed nodes authenticate the data content of the new block according to the consensus mechanism of the private chain; once authentication passes, the new block takes effect and is automatically linked to the tail of the private chain. The private chain operates within an independent enterprise network that is physically isolated from the public chain portion, to ensure communication and data security. The enterprise CA root node communicates with the other nodes of the Qtum public chain through the physical isolation facility.
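The block generation flow above can be sketched as follows. This is an illustrative sketch only: the dividing rule (a score threshold), the block fields, and the unanimous-approval check standing in for the private chain consensus mechanism are all assumptions, and the encryption and broadcast steps are omitted.

```python
import hashlib
import json
import time
from dataclasses import dataclass

GENESIS_HASH = "0" * 64  # hypothetical previous-hash value for the first block


@dataclass
class Block:
    index: int
    prev_hash: str
    content: list
    timestamp: float
    block_hash: str = ""


def compute_hash(index: int, prev_hash: str, content: list, timestamp: float) -> str:
    """Hash every field that later validation must be able to trust."""
    payload = json.dumps([index, prev_hash, content, timestamp], sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def split_samples(evaluation_result: list, threshold: float = 6.0):
    """Assumed dividing rule: items scoring at or above the threshold are positive."""
    positives = [item for item in evaluation_result if item["score"] >= threshold]
    negatives = [item for item in evaluation_result if item["score"] < threshold]
    return positives, negatives


def propose_block(chain: list, positives: list, timestamp=None) -> Block:
    """Build a candidate block from the positive sample set, linked to the chain tail."""
    ts = time.time() if timestamp is None else timestamp
    prev_hash = chain[-1].block_hash if chain else GENESIS_HASH
    index = len(chain)
    return Block(index, prev_hash, positives, ts, compute_hash(index, prev_hash, positives, ts))


def validate_block(chain: list, block: Block) -> bool:
    """Check that a peer node could run: position, linkage and content hash."""
    expected_prev = chain[-1].block_hash if chain else GENESIS_HASH
    return (
        block.index == len(chain)
        and block.prev_hash == expected_prev
        and block.block_hash == compute_hash(block.index, block.prev_hash, block.content, block.timestamp)
    )


def append_if_accepted(chain: list, block: Block, validators: list) -> bool:
    """Stand-in for consensus: the block is linked only if every validator approves."""
    if all(validator(chain, block) for validator in validators):
        chain.append(block)
        return True
    return False
```

Only the positive sample set enters the block, as in the embodiment; the negative sample set is kept outside the chain and reused later for retraining.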
The blockchain public service module 70 may be a government-certified or industry-certified public chain service mechanism, that is, a set of resources comprising a series of trusted nodes and service mechanisms, including the enterprise CA center node module 50. The blockchain public service module 70 may also provide public digital certificate authorization for the enterprise CA center node module 50. The enterprise CA center node then acts as the root node within the enterprise management and control boundary and in turn provides digital certificate services to the distributed intelligent node module 60.
In the embodiment of the present application, the blockchain public service module 70 is configured to provide public digital certificate authorization for the enterprise CA center node module 50. It is further configured to form a public chain with the enterprise CA center node module 50, authenticate a new block of the private chain according to the public chain consensus mechanism, and place the new block of the private chain on the public chain after the authentication is passed.
Further, the blockchain public service module 70 includes a public chain new block generating unit. A new block of the public chain may be generated as follows. The public chain service system provided by the blockchain public service module 70 provides root node digital certificate authorization for the enterprise CA center node module 50: the Qtum quantum chain provides root node digital certificate authorization for the enterprise CA root node server, generates a root CA certificate, and completes the identity authentication and trust of the enterprise CA root node server. After the public chain new block generating unit learns that the private chain has generated a valid new block, the enterprise CA root node server, acting as a trusted node of the Qtum quantum chain, packages the data content of the new private chain block according to the public chain block data content format to generate the data content of the new public chain block, encrypts that data content based on its stored public chain public and private key pair, and broadcasts the encrypted data content to the other nodes of the Qtum quantum chain public chain. The other distributed nodes of the Qtum quantum chain authenticate the data content of the new public chain block according to the Qtum consensus mechanism; once authentication passes, the new block takes effect and is automatically linked to the tail of the public chain.
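The repackaging step can be illustrated with a short sketch. The actual public chain block data format is not given in the text, so the record layout below (`source_node`, `private_block_index`, `payload`, `payload_digest`) is hypothetical and is not Qtum's real format; encryption and broadcasting are omitted.

```python
import hashlib
import json


def package_for_public_chain(private_block: dict, enterprise_node_id: str) -> dict:
    """Wrap a validated private-chain block in an assumed public-chain record layout."""
    body = json.dumps(private_block, sort_keys=True, ensure_ascii=False)
    return {
        "source_node": enterprise_node_id,
        "private_block_index": private_block["index"],
        "payload": body,
        # The digest lets public-chain nodes detect any change to the payload.
        "payload_digest": hashlib.sha256(body.encode("utf-8")).hexdigest(),
    }


def verify_public_record(record: dict) -> bool:
    """Recompute the digest, as a receiving node might do before running consensus."""
    digest = hashlib.sha256(record["payload"].encode("utf-8")).hexdigest()
    return digest == record["payload_digest"]
```

A record that fails `verify_public_record` would simply not be put forward for authentication under the public chain consensus mechanism.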
Referring to fig. 2, a schematic block diagram of a preferred multi-architecture NLP pre-training model and blockchain based commercial environment evaluation system is provided in an embodiment of the present application.
The enterprise big data middle platform module 10 in the embodiment of the application may be the "digital cube" big data platform, a big data platform that supports mass data storage, data stream processing, data cleaning, data mining and intelligent data services for needs such as market investigation and research (including business environment evaluation) and digital operation.
The data evaluation informatization management module 40 may be the "Yingshitong" business environment evaluation system, an informatization system that supports the whole process of business environment evaluation, covering both the investigation design stage and the data acquisition and evaluation stage. Backed by the "digital cube" big data platform, the "Yingshitong" business environment evaluation system also integrates algorithm services for intelligent data processing, which provide intelligent support for the data and information processing of each link. The algorithm services are provided uniformly by the "digital cube" big data platform, and a user can access the "Yingshitong" business environment evaluation system through the terminal device corresponding to the distributed intelligent node module 60 to obtain the corresponding services.
The NLP pre-training model preprocessing module 20 may be a ChatGPT preprocessing algorithm model in the embodiment of the present application. The ChatGPT preprocessing algorithm is a natural language processing tool. In this system, the module first uses billions of items of open external public data, combined with the large body of historical business environment evaluation data accumulated by the enterprise, to train an initial model that understands and learns human language. In particular, it provides professional semantic analysis for data collection and investigation analysis, outputs reasoning and judgment results, and outputs text, language and charts in the format agreed with the user. After valid positive and negative samples are generated during business environment evaluation and uploaded centrally to the enterprise big data platform, ChatGPT can undergo secondary training on the expanded and optimized sample set to improve model performance. The ChatGPT model is also incorporated, as a standardized intelligent data service of the system, into the algorithm service system of the "digital cube" big data platform, where it is available to be called by the "Yingshitong" business environment evaluation system.
The heterogeneous NLP pre-training model checking module 30 may be a BERT checking algorithm model in the embodiment of the application. It adopts a Transformer-based pre-trained language model whose architecture and pre-training approach differ from those of ChatGPT. Because of these differences, BERT is more efficient at specific NLP tasks such as text classification and question answering. The BERT algorithm model is used to check the investigation design scheme and the investigation data, and is partly complementary to ChatGPT preprocessing, which helps avoid the data processing quality problems that relying on a single model might cause. The BERT checking algorithm model is trained on the initial samples and on the expanded and optimized samples obtained through the "digital cube" big data platform, so that model performance is continuously improved.
The enterprise CA center node module 50 in an embodiment of the present application may be the root node server for digital certificate authentication in the enterprise management domain. It obtains public digital certificate authorization services from the Qtum public chain, and issues and manages certificates for the other distributed nodes in the domain. The enterprise CA root node is also a node of the enterprise private chain: it stores a public and private key pair and authenticates new blocks generated in the domain according to the consensus mechanism. Further, the enterprise CA root node is a distributed public chain node trusted by the Qtum quantum chain. The enterprise CA root node also uploads block data to the enterprise big data platform.
The distributed intelligent node module 60 in the embodiment of the present application may be an intelligent terminal device, which may include, but is not limited to, a computer, a tablet, a mobile phone, a workstation, a server, a cloud server, a virtual machine, and the like. The user accesses the "Yingshitong" business environment evaluation system through the intelligent terminal device and obtains data and algorithm services for the whole business environment evaluation process in one place. The intelligent terminal device can obtain digital certificate authorization from the enterprise CA root node and thereby obtain identity authentication. As a distributed node of the enterprise private chain, the intelligent terminal device stores a public and private key pair and can authenticate new blocks generated in the domain according to the consensus mechanism. Each intelligent terminal device provides speech recognition, text analysis and data preprocessing, either by acquiring services online from the "digital cube" big data platform or through built-in speech recognition, text analysis and data processing tools.
The blockchain public service module 70 may be the Qtum quantum chain, a typical domestic public blockchain. The Qtum quantum chain is compatible with mainstream blockchain ecosystems and provides business application services by creating simple, practical decentralized applications. It provides a decentralized open-source smart contract platform and value transfer protocol and adopts a proof-of-stake consensus mechanism, under which nodes earn rewards by verifying transactions. If certain block parameters need to be modified, the community formed by the decentralized nodes must vote, which effectively addresses the problem of data tampering.
As a preferred embodiment, the specific implementation of the above scheme will be described in detail.
Specifically, the process in the investigation design stage may be as follows. The user accesses the "Yingshitong" business environment evaluation system through the terminal device corresponding to the distributed intelligent node module 60 and enters the overall requirements of the target business environment evaluation. The ChatGPT preprocessing algorithm in the NLP pre-training model preprocessing module 20 then automatically generates the investigation requirements through a voice and text "chat" interaction. Data acquisition and preprocessing are then carried out based on the investigation requirements and the ChatGPT preprocessing algorithm in the trained NLP pre-training model, so as to generate an investigation design scheme for the target commercial environment. The investigation design scheme comprises the questionnaire design, the expert scheme and content, the qualitative research content, and the data scheme and acquisition mode required by quantitative research, all generated automatically by the ChatGPT preprocessing algorithm.
After receiving the investigation design scheme, the heterogeneous NLP pre-training model checking module 30 invokes the BERT model in the trained heterogeneous NLP pre-training model to check the investigation design scheme against the investigation requirements, and generates a consistency evaluation result according to a preset consistency judgment rule. The consistency evaluation result comprises an overall qualitative judgment and a quantitative consistency evaluation on preset dimensions; for inconsistent content, the BERT model gives a comparison result. If the consistency evaluation result is greater than the set threshold, the credibility of the consistency evaluation result is evaluated manually, a quantitative credibility score is generated automatically from the classification marks, the content is divided on that basis into a first positive sample set and a first negative sample set carrying credibility indexes, and the first positive sample set is fused into the final investigation design scheme.
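The threshold-and-classification logic of this checking step can be sketched as below. The per-dimension consistency scores would come from the checking model and the credibility from manual review; here both are plain inputs. The averaging rule, the two thresholds and all names (`SchemeItem`, `classify_items`, `final_scheme`) are assumptions made for illustration, since the text leaves the judgment rule as "preset".

```python
from dataclasses import dataclass


@dataclass
class SchemeItem:
    name: str
    dimension_scores: dict      # dimension -> consistency in [0, 1], assumed model output
    credibility: float = 0.0    # assumed reviewer-assigned credibility in [0, 1]


def overall_consistency(item: SchemeItem) -> float:
    """Assumed aggregation: the mean of the per-dimension consistency scores."""
    scores = list(item.dimension_scores.values())
    return sum(scores) / len(scores) if scores else 0.0


def classify_items(items, consistency_threshold=0.7, credibility_threshold=0.5):
    """Split items into a first positive and a first negative sample set."""
    positives, negatives = [], []
    for item in items:
        passes = (
            overall_consistency(item) > consistency_threshold
            and item.credibility >= credibility_threshold
        )
        (positives if passes else negatives).append(item)
    return positives, negatives


def final_scheme(positives) -> list:
    """Only the positive set is fused into the final investigation design scheme."""
    return [item.name for item in positives]
```

Both sets keep their credibility values, so the negative set remains usable as labeled training data later on.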
Specifically, the process in the data acquisition and evaluation stage may be as follows. Market investigation activities such as questionnaire surveys, expert interviews, data collection and discussions are carried out according to the final investigation design scheme, and the NLP pre-training model is called as a processing tool to process voice, text, picture and video data and extract information from it, yielding the collected investigation data. The data evaluation informatization management module 40 acquires the collected investigation data and, based on the "Yingshitong" business environment evaluation system, calls the ChatGPT preprocessing algorithm model to preprocess the data and automatically generate questionnaires, interview results, qualitative research reports and quantitative analysis reports.
Further, based on the "Yingshitong" business environment evaluation system, personnel call the BERT checking algorithm model to check against the final investigation design scheme and generate a scheme compliance evaluation result. They then call the manual checking function service provided by the "Yingshitong" business environment evaluation system and the index auditing model service provided by the "digital cube" big data platform, evaluate the preprocessed results and the checked results item by item, and classify each item as consistent, basically consistent, inconsistent, or impossible to judge, according to how the two results compare. The accepted items of the final data result are determined manually to form a second positive sample set, the data results that are not accepted automatically form a second negative sample set, and the accepted and non-accepted data items are each scored quantitatively on a scale of 0 to 10. After the manually assisted audit is finished, the index auditing model of the "digital cube" big data platform automatically performs a comprehensive evaluation of the preprocessed results and the checked results according to its built-in analytic hierarchy model. A typical analytic hierarchy model defines layered, multi-dimensional first-level and second-level indexes, and generates first-level and second-level evaluation indexes from the user scoring results. A data acquisition evaluation result is then generated from the second positive sample set and the index evaluation.
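The item-by-item comparison and the two-level index aggregation can be sketched as follows. The tolerance used to separate "basically consistent" from "inconsistent", the use of a simple mean for second-level indexes, and the weights are all assumptions; a full analytic hierarchy process would normally derive the weights from pairwise comparison matrices, which is not shown here.

```python
from typing import Optional


def compare_results(pre: Optional[float], checked: Optional[float], tolerance: float = 2.0) -> str:
    """Classify one item by the gap between its preprocessed and checked scores (0-10)."""
    if pre is None or checked is None:
        return "cannot be judged"
    gap = abs(pre - checked)
    if gap == 0:
        return "consistent"
    if gap <= tolerance:
        return "basically consistent"
    return "inconsistent"


def second_level_indexes(scores: dict) -> dict:
    """Second-level index = mean of the 0-10 item scores filed under it (assumed rule)."""
    return {name: sum(values) / len(values) for name, values in scores.items() if values}


def first_level_indexes(hierarchy: dict, second: dict) -> dict:
    """First-level index = weighted average of its second-level indexes.

    hierarchy maps a first-level name to {second-level name: weight}.
    """
    result = {}
    for first, children in hierarchy.items():
        total_weight = sum(children.values())
        result[first] = sum(second[child] * w for child, w in children.items()) / total_weight
    return result
```

For example, with user scores of 8 and 6 for questionnaire items and 10 for an interview item, equal weights give a first-level index of 8.5 for the branch containing both.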
Still further, the first negative sample set generated in the investigation design stage and the second negative sample set generated in the data acquisition and evaluation stage are combined into a newly added negative sample data set carrying credibility values, and a new block of the blockchain is generated based on the second positive sample data; the generation process of the new block has been described above and is not repeated here. Positive sample block data containing credibility values is obtained from the blockchain and packaged together with the newly added negative sample data set into a new NLP training sample set, which is sent to the enterprise big data middle platform module 10 for storage. The enterprise big data middle platform module 10 sends the new NLP training sample set to the NLP pre-training model preprocessing module 20 and the heterogeneous NLP pre-training model checking module 30, where the NLP pre-training model and the heterogeneous NLP pre-training model undergo secondary training on it. This completes the model upgrade and can further improve the accuracy of commercial environment evaluation.
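Assembling the new training sample set can be sketched as below, under the assumption that each sample is a small record with a text, a binary label and a credibility value. The record fields and the function name `build_training_set` are hypothetical; the text does not define the sample format.

```python
def build_training_set(chain_blocks: list, first_negatives: list, second_negatives: list) -> list:
    """Merge positive samples read from chain blocks with both negative sample sets."""
    samples = []
    # Positive samples come from the block content recorded on the chain.
    for block in chain_blocks:
        for item in block["content"]:
            samples.append({"text": item["text"], "label": 1, "credibility": item["credibility"]})
    # Negative samples from the design stage and the evaluation stage are pooled.
    for item in first_negatives + second_negatives:
        samples.append({"text": item["text"], "label": 0, "credibility": item["credibility"]})
    return samples
```

The resulting list would be stored by the big data middle platform module and passed to both models for secondary training; the credibility value could be used, for example, as a per-sample weight, though the text does not say how it is consumed.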
It should be noted that: in the system provided in the above embodiment, when implementing the functions thereof, only the division of the above functional modules is used as an example, in practical application, the above functional allocation may be implemented by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules, so as to implement all or part of the functions described above. In addition, the system and method embodiments provided in the foregoing embodiments belong to the same concept, and specific implementation processes of the system and method embodiments are detailed in the method embodiments, which are not repeated herein.
The present application also provides an electronic device that may include a commercial environment evaluation system based on a multi-architecture NLP pre-training model and a blockchain, and that may perform all of the above functions of the commercial environment evaluation system based on the multi-architecture NLP pre-training model and the blockchain.
The embodiments of the present application are all preferred embodiments of the present application, and are not intended to limit the scope of the present application, wherein like reference numerals are used to refer to like elements throughout. Therefore: all equivalent changes in structure, shape and principle of the application should be covered in the scope of protection of the application.

Claims (10)

1. A commercial environment evaluation system based on a multi-architecture NLP pre-training model and a blockchain, the system comprising:
the enterprise big data middle platform module is used for acquiring the authenticated training sample data;
the NLP pre-training model preprocessing module is used for training an NLP pre-training model according to the training sample data, and carrying out data acquisition and preprocessing on the basis of the trained NLP pre-training model and the investigation requirements of a user on a target commercial environment to generate an investigation design scheme of the target commercial environment;
the heterogeneous NLP pre-training model checking module is used for training a heterogeneous NLP pre-training model according to the training sample data, checking the investigation design scheme based on the trained heterogeneous NLP pre-training model, and obtaining a final investigation design scheme;
the data evaluation informatization management module is used for acquiring investigation data acquired according to the final investigation design scheme and acquiring an evaluation result of carrying out reliability quantitative evaluation on the final investigation design scheme and the investigation data.
2. The multi-architecture NLP pre-training model and blockchain based commercial environment assessment system of claim 1, wherein the system further comprises:
and the enterprise CA center node module is used for authenticating the sample data to obtain authenticated training sample data and transmitting the authenticated training sample data to the enterprise big data center module.
3. The multi-architecture NLP pre-training model and blockchain-based business environment assessment system of claim 2, wherein the enterprise CA center node module and the distributed intelligent node module form a private chain, the system further comprising:
the distributed intelligent node module is used for generating a new block of the private chain based on the evaluation result;
the blockchain public service module is used for providing public digital certificate authorization for the enterprise CA center node module, and is further used for forming a public chain with the enterprise CA center node module, authenticating the new block of the private chain according to the consensus mechanism of the public chain, and placing the new block of the private chain on the public chain after the authentication is passed.
4. The multi-architecture NLP pre-training model and blockchain-based commercial environment evaluation system of claim 3, wherein the enterprise big data middle stage module is further configured to obtain a new NLP training sample set based on the evaluation result, and send the new NLP training sample set to the NLP pre-training model preprocessing module and the heterogeneous NLP pre-training model checking module for the NLP pre-training model and the heterogeneous NLP pre-training model to perform secondary training based on the new NLP training sample set, so as to complete model upgrading.
5. The multi-architecture NLP pre-training model and blockchain-based business environment assessment system of claim 3, wherein the distributed intelligent node module comprises:
the private chain new block generation unit is used for acquiring the evaluation result, determining the data content of the private chain new block based on the evaluation result, authenticating the data content of the private chain new block according to the consensus mechanism of the private chain, and linking to the tail of the private chain.
6. The multi-architecture NLP pre-training model and blockchain-based business environment assessment system of claim 5, wherein the blockchain common service module comprises:
the public chain new block generating unit is used for packaging the data content of the new block of the private chain according to the public chain block data format of the Qtum quantum chain, generating the data content of the new block of the public chain, authenticating the data content of the new block of the public chain according to the Qtum consensus mechanism, and linking it to the tail of the public chain.
7. The multi-architecture NLP pre-training model and blockchain-based commercial environment evaluation system of claim 1, wherein the NLP pre-training model preprocessing module being configured to train an NLP pre-training model according to the training sample data, and to perform data acquisition and preprocessing based on the trained NLP pre-training model and the investigation requirements of a user on a target commercial environment to generate an investigation design scheme for the target commercial environment, comprises:
the NLP pre-training model preprocessing module is specifically configured to train the NLP pre-training model according to the training sample data, respond to an investigation requirement input by a user for evaluating a target commercial environment, and perform data acquisition and preprocessing based on a ChatGPT preprocessing algorithm in the trained NLP pre-training model and the investigation requirement to generate an investigation design scheme for the target commercial environment.
8. The multi-architecture NLP pre-training model and blockchain-based commercial environment evaluation system of claim 1, wherein the heterogeneous NLP pre-training model checking module is configured to train a heterogeneous NLP pre-training model according to the training sample data, and check the investigation design scheme based on the trained heterogeneous NLP pre-training model, to obtain a final investigation design scheme, and comprises:
the heterogeneous NLP pre-training model checking module is specifically configured to train a heterogeneous NLP pre-training model according to the training sample data, check the investigation design scheme based on the BERT model in the trained heterogeneous NLP pre-training model to obtain a scheme evaluation coincidence value, take data with the scheme evaluation coincidence value being greater than a preset value as a positive sample set, and generate a final investigation design scheme based on the positive sample set.
9. The multi-architecture NLP pre-training model and blockchain-based commercial environment evaluation system of claim 1, wherein the data evaluation informatization management module is configured to obtain investigation data collected according to the final investigation design scheme, and obtain a data acquisition evaluation result of performing reliability quantitative evaluation on the final investigation design scheme and the investigation data, and comprises:
the data evaluation informatization management module is specifically configured to obtain investigation data collected according to the final investigation design scheme, obtain reliability verification results of the investigation data and the final investigation design scheme, and divide the reliability verification results into a second positive sample set and a second negative sample set according to a preset division rule; and to analyze the reliability verification results according to an analytic hierarchy model in a preset index auditing model to obtain a plurality of evaluation index results, and generate a data acquisition evaluation result of credibility quantitative evaluation based on the second positive sample set and the plurality of evaluation index results.
10. An electronic device, characterized in that the electronic device comprises the commercial environment evaluation system based on the multi-architecture NLP pre-training model and the blockchain as claimed in any one of claims 1 to 9.
CN202310618083.XA 2023-05-29 2023-05-29 Commercial environment evaluation system based on multi-architecture NLP pre-training model and blockchain Pending CN116645153A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310618083.XA CN116645153A (en) 2023-05-29 2023-05-29 Commercial environment evaluation system based on multi-architecture NLP pre-training model and blockchain

Publications (1)

Publication Number Publication Date
CN116645153A (en) 2023-08-25

Family

ID=87622510

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310618083.XA Pending CN116645153A (en) 2023-05-29 2023-05-29 Commercial environment evaluation system based on multi-architecture NLP pre-training model and blockchain

Country Status (1)

Country Link
CN (1) CN116645153A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination