CN112507102B - Predictive deployment system, method, apparatus and medium based on pre-training paradigm model


Info

Publication number: CN112507102B
Application number: CN202011505461.6A
Authority: CN (China)
Prior art keywords: prediction, module, preprocessing, deployment, result
Legal status: Active (the legal status is an assumption and is not a legal conclusion)
Other versions: CN112507102A
Other languages: Chinese (zh)
Inventor: 王晨秋
Current and original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Filing history: application CN202011505461.6A filed by Beijing Baidu Netcom Science and Technology Co Ltd; CN112507102A published on application; application granted; CN112507102B published on grant

Classifications

    • G06F16/3329 — Natural language query formulation or dialogue systems (under G Physics › G06F Electric digital data processing › G06F16/00 Information retrieval › G06F16/33 Querying › G06F16/332 Query formulation)
    • G06F16/3344 — Query execution using natural language analysis (under G06F16/33 Querying › G06F16/334 Query execution)
    • G06N20/00 — Machine learning (under G06N Computing arrangements based on specific computational models)
Abstract

Disclosed are a predictive deployment system, method, apparatus and medium based on a pre-training paradigm model, relating to artificial intelligence fields such as deep learning, natural language processing and computer vision. The system may comprise: a deployment module for sending the original text data corresponding to a user's task request to the data preprocessing module, acquiring the preprocessing result returned by the data preprocessing module, sending that result to the prediction module, acquiring the prediction result returned by the prediction module, and providing the prediction result to the user; a data preprocessing module for preprocessing the original text data into a preprocessing result that meets the prediction requirements and sending it to the deployment module; and a prediction module for calling a prediction engine of a deep learning framework, acquiring the prediction result the engine returns after predicting from the preprocessing result, and sending the prediction result to the deployment module. Applying the disclosed scheme can reduce users' learning and development costs, improve processing efficiency, and so on.

Description

Predictive deployment system, method, apparatus and medium based on pre-training paradigm model
Technical Field
The present disclosure relates to the field of artificial intelligence technologies, and in particular, to a system, a method, an apparatus, and a medium for predictive deployment based on a pre-trained paradigm model in the fields of deep learning, natural language processing, and computer vision.
Background
With the gradual popularization of pre-trained language models in the field of Natural Language Processing (NLP), the importance of predictive deployment (i.e., prediction and deployment) based on the pre-training paradigm model is increasing. The pre-training paradigm refers to the training method in natural language processing of pre-training a language model and then fine-tuning it.
Currently, the following implementation is generally employed: the user's project directly calls the relevant prediction interface of a deep learning framework, and the user completes the data preprocessing and various related adaptations. This approach requires the user to be familiar with internal details such as the pre-trained model's preprocessing and prediction logic; it demands a high level of expertise from the user, is error prone, and incurs high learning and development costs.
Disclosure of Invention
The present disclosure provides systems, methods, apparatus, and media for predictive deployment based on a pre-trained paradigm model.
A predictive deployment system based on a pre-trained paradigm model, comprising: the system comprises a deployment module, a data preprocessing module and a prediction module;
the deployment module is used for sending original text data corresponding to a task request of a user to the data preprocessing module, acquiring a preprocessing result returned by the data preprocessing module, sending the preprocessing result to the prediction module, acquiring a prediction result returned by the prediction module, and providing the prediction result to the user;
the data preprocessing module is used for preprocessing the original text data to obtain a preprocessing result meeting the prediction requirement and sending the preprocessing result to the deployment module;
the prediction module is used for calling a prediction engine of a deep learning framework, acquiring a prediction result returned after the prediction engine predicts according to the preprocessing result, and sending the prediction result to the deployment module.
A prediction deployment method based on a pre-training paradigm model comprises the following steps:
the method comprises the steps that a prediction deployment system preprocesses original text data corresponding to a task request of a user to obtain a preprocessing result meeting prediction requirements;
and the prediction deployment system calls a prediction engine of a deep learning framework, acquires a prediction result returned after the prediction engine predicts according to the preprocessing result, and provides the prediction result for the user.
An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method as described above.
A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method as described above.
A computer program product comprising a computer program which, when executed by a processor, implements a method as described above.
One embodiment in the above disclosure has the following advantages or benefits: the prediction and deployment can be automatically completed by a prediction deployment system based on a pre-training paradigm model aiming at task requests of users, so that the learning and development cost of the users is reduced, and the processing efficiency is improved.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a schematic structural diagram of a first embodiment of a predictive deployment system 10 based on a pre-trained paradigm model according to the present disclosure;
FIG. 2 is a schematic structural diagram of a second embodiment of a predictive deployment system 20 based on a pre-trained paradigm model according to the present disclosure;
fig. 3 is a schematic diagram illustrating an interaction manner between the deployment module 101 and the authentication statistics module 104 according to the present disclosure;
FIG. 4 is a schematic diagram of an inheritance relationship of a word cutting operation class according to the present disclosure;
FIG. 5 is a schematic diagram of the prediction module 103 class inheritance relationship according to the present disclosure;
FIG. 6 is a block diagram of a predictive deployment system according to the present disclosure;
FIG. 7 is a flowchart of an embodiment of a predictive deployment method based on a pre-trained paradigm model according to the present disclosure;
FIG. 8 illustrates a schematic block diagram of an example electronic device 800 that can be used to implement embodiments of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In addition, it should be understood that the term "and/or" herein merely describes an association relationship between associated objects, meaning that three relationships may exist; e.g., A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship.
Fig. 1 is a schematic structural diagram of a predictive deployment system 10 based on a pre-trained paradigm model according to a first embodiment of the present disclosure. As shown in fig. 1, includes: a deployment module 101, a data pre-processing module 102, and a prediction module 103.
The deployment module 101 is configured to send original text data corresponding to a task request of a user to the data preprocessing module 102, obtain a preprocessing result returned by the data preprocessing module 102, send the preprocessing result to the prediction module 103, obtain a prediction result returned by the prediction module 103, and provide the prediction result to the user.
The data preprocessing module 102 is configured to preprocess the original text data to obtain a preprocessing result meeting the prediction requirement, and send the preprocessing result to the deployment module 101.
The prediction module 103 is configured to invoke a prediction engine of the deep learning framework, obtain a prediction result returned after the prediction engine performs prediction according to the preprocessing result, and send the prediction result to the deployment module 101.
It can be seen that, in the scheme of the system embodiment, the prediction and deployment and the like can be automatically completed by the prediction deployment system based on the pre-trained paradigm model according to the task request of the user, so that the learning and development cost of the user is reduced, and the processing efficiency and the like are improved.
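The interaction among the three modules can be sketched as follows — a minimal, hypothetical Python illustration of the flow described above (all class and method names are inventions of this sketch, not taken from the patent's actual implementation):

```python
# Illustrative sketch of the three-module flow; names are hypothetical.

class DataPreprocessModule:
    def preprocess(self, raw_text):
        # Turn raw text into a prediction-ready structure (here: a token list).
        return raw_text.split()

class PredictModule:
    def predict(self, preprocessed):
        # Stand-in for calling the deep learning framework's prediction engine.
        return {"n_tokens": len(preprocessed)}

class DeployModule:
    """Coordinates the other two modules, as the patent describes."""
    def __init__(self, preprocessor, predictor):
        self.preprocessor = preprocessor
        self.predictor = predictor

    def handle_request(self, raw_text):
        pre = self.preprocessor.preprocess(raw_text)   # send raw text, get preprocessing result
        return self.predictor.predict(pre)             # send result, get prediction

deploy = DeployModule(DataPreprocessModule(), PredictModule())
print(deploy.handle_request("hello world"))  # → {'n_tokens': 2}
```

The point of the split is that the deployment module owns all user-facing coordination, so the preprocessing and prediction internals never leak into the user's project code.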
Besides the deployment module 101, the data preprocessing module 102 and the prediction module 103, the prediction deployment system of the present disclosure may further include an authentication statistics module 104.
Fig. 2 is a schematic structural diagram of a predictive deployment system 20 based on a pre-trained paradigm model according to a second embodiment of the present disclosure. As shown in fig. 2, includes: a deployment module 101, a data preprocessing module 102, a prediction module 103, and an authentication statistics module 104.
The deployment module 101 may send the acquired authentication information of the user to the authentication statistics module 104, and acquire an authentication result returned by the authentication statistics module 104, and if the authentication result is successful, the subsequent processing may be continued, that is, the original text data is sent to the data preprocessing module 102, and the like.
Accordingly, the authentication statistics module 104 can authenticate the user according to the authentication information of the user, and send the authentication result to the deployment module 101.
The following describes specific operation of each of the above modules.
1) Authentication statistics module 104
Fig. 3 is a schematic diagram illustrating an interaction manner between the deployment module 101 and the authentication statistics module 104 according to the present disclosure. As shown in fig. 3, the authentication statistics module 104 may be composed of a management platform 1041, an authentication Server (Server)1042, a database 1043, and the like. The authentication Server 1042 may be an authentication Server built based on a Baidu open source Remote Procedure Call Protocol (BRPC) framework, and may be deployed in a local area network, and the database 1043 may be a MySQL database, where SQL refers to Structured Query Language (Structured Query Language).
A user may log in to the management platform 1041 in advance to apply for access; the platform may accordingly assign the user an Access Key (AK, Access Key ID) / Secret Access Key (SK, Secret Access Key) pair and write the assigned AK/SK into the database 1043.
The authentication Server 1042 can periodically read the database 1043 to obtain all AK/SK pairs in it. The deployment module 101 may send the acquired authentication information of the user, i.e., the user's AK/SK, to the authentication Server 1042. The authentication Server 1042 may then verify whether the user's AK/SK is legitimate, e.g., determine whether it is among the AK/SK pairs acquired from the database 1043: if so, the user may be considered legitimate; otherwise, the user may be considered illegitimate. The verification result of whether the user's AK/SK is legitimate may be sent to the deployment module 101 as the authentication result.
If the authentication result is that authentication succeeded, i.e., the user is determined to be a legitimate user, the deployment module 101 may continue the subsequent processing, i.e., send the original text data to the data preprocessing module 102, etc.; otherwise, failure information may be prompted and the process exits.
Through the above processing, it can be ensured that only legitimate users use the services of the predictive deployment system, improving the system's security.
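A minimal Python sketch of the AK/SK check just described: the authentication server keeps the AK/SK pairs it periodically reads from the database, and a request is legitimate only if its pair is among them. Class, method and field names here are illustrative assumptions, not the patent's actual code:

```python
# Illustrative AK/SK authentication check; names are hypothetical.

class AuthServer:
    def __init__(self):
        self._pairs = set()

    def refresh_from_db(self, rows):
        # Stand-in for the periodic read of the MySQL database.
        self._pairs = {(r["ak"], r["sk"]) for r in rows}

    def authenticate(self, ak, sk):
        # The pair is legitimate only if it matches one stored in the database.
        return (ak, sk) in self._pairs

server = AuthServer()
server.refresh_from_db([{"ak": "user1-ak", "sk": "user1-sk"}])
print(server.authenticate("user1-ak", "user1-sk"))   # → True
print(server.authenticate("user1-ak", "wrong-sk"))   # → False
```

Caching the pairs in the server and refreshing periodically, as described, avoids a database round trip on every request.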
In addition, the authentication statistics module 104 may also write the predetermined information corresponding to the task request into the database 1043, i.e., store the predetermined information in the database 1043. For example, the authentication Server 1042 may write the predetermined information corresponding to the task request into the database 1043. Which information the predetermined information specifically includes may be determined according to actual needs; for example, it may include information such as the task type, to facilitate subsequent statistics (e.g., statistics on a user's usage of different task types). How the predetermined information is acquired is not limited.
2) Data preprocessing module 102
The data preprocessing module 102 may preprocess the original text data obtained from the deployment module 101, so as to obtain a preprocessing result meeting the prediction requirement, and return the preprocessing result to the deployment module 101.
Specifically, the data preprocessing module 102 may perform a word segmentation operation on the original text data to cut it into a token (Token) sequence, then obtain the identifier (id) sequences corresponding to the original text data from the Token sequence, and further generate Tensor structure data from the id sequences, using the Tensor structure data as the preprocessing result.
The word segmentation operation of the data preprocessing module 102 may adopt a factory-pattern design: the original text data is segmented into a Token sequence by the segmenter (Tokenizer) class corresponding to the segmentation manner of the original text data. Specifically, a tokenizer-type string may be obtained from a configuration file provided by the user, the segmentation manner corresponding to the original text data determined from that string, and the corresponding tokenizer object created by calling a create_tokenizer method.
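The factory pattern just described can be sketched in a few lines of Python. This is an illustrative reconstruction under stated assumptions — the config key, registry dict, and the simplified tokenizers are inventions of this sketch, only `create_tokenizer` is a name from the description:

```python
# Illustrative factory-pattern tokenizer creation; details are hypothetical.

class FullTokenizer:
    def tokenize(self, text):
        return list(text)          # per-character split, as for Chinese

class WordTokenizer:
    def tokenize(self, text):
        return text.split()        # per-word split

_TOKENIZERS = {"full_tokenizer": FullTokenizer, "word_tokenizer": WordTokenizer}

def create_tokenizer(config):
    # Read the tokenizer-type string from the user's config and
    # instantiate the matching Tokenizer class.
    return _TOKENIZERS[config["tokenizer"]]()

tok = create_tokenizer({"tokenizer": "word_tokenizer"})
print(tok.tokenize("predictive deployment"))  # → ['predictive', 'deployment']
```

The benefit is that adding a new segmentation manner only means registering a new class; callers never branch on the tokenizer type themselves.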
Fig. 4 is a schematic diagram of an inheritance relationship of a word cutting operation class according to the present disclosure. As shown in fig. 4, BaseTokenizer is a base class, and defines an interface for word cutting operation and common function implementation.
The full segmenter (FullTokenizer) class performs single-character segmentation, applying operations such as Unicode conversion, punctuation splitting, lowercasing, Chinese character segmentation and accent removal to the original text data. For Chinese, FullTokenizer can segment text into individual Chinese characters according to the Unicode code range of Chinese; for English, it can segment words into subwords using a longest-string-matching algorithm. The FullTokenizer class is mainly applied to the base and large models of BERT (Bidirectional Encoder Representations from Transformers) and ERNIE (Enhanced Representation through kNowledge IntEgration).
The word segmentation device (WordTokenizer) class is used for segmenting original text data into word sequences and is mainly suitable for scenes without using a pre-training model BERT/ERNIE.
The segment segmenter (WSSPTokenizer) class can call Google's SentencePiece module for segmentation: for Chinese, words that often appear together can be merged into a single token, and for English, words can be cut into smaller semantic units, reducing the vocabulary size. It is mainly suitable for the Tiny models of BERT/ERNIE.
Through the above processing, the word segmentation operation on the original text data can be completed accurately and efficiently, i.e., the original text data is cut into Token sequences, meeting the segmentation needs of different business scenarios.
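The two FullTokenizer behaviours described above — per-character splitting of Chinese by Unicode range, and greedy longest-match subword segmentation of English — can be sketched as follows. This is a simplified illustration, not the patent's implementation; real WordPiece-style tokenizers additionally mark continuation pieces (e.g. with a "##" prefix), which is omitted here:

```python
# Illustrative simplification of the two segmentation behaviours.

def split_cjk_chars(text):
    """Split characters in the CJK Unified Ideographs range into single tokens."""
    tokens, buf = [], ""
    for ch in text:
        if "\u4e00" <= ch <= "\u9fff":      # basic CJK Unified Ideographs block
            if buf:
                tokens.append(buf)
                buf = ""
            tokens.append(ch)                # each Chinese character stands alone
        else:
            buf += ch
    if buf:
        tokens.append(buf)
    return tokens

def wordpiece(word, vocab):
    """Greedy longest-string-match segmentation of one word into subwords."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        while end > start and word[start:end] not in vocab:
            end -= 1                         # shrink until the longest match
        if end == start:
            return ["[UNK]"]                 # no piece matched at all
        pieces.append(word[start:end])
        start = end
    return pieces

vocab = {"deploy", "ment", "pre", "dict", "ion"}
print(split_cjk_chars("中文ab"))            # → ['中', '文', 'ab']
print(wordpiece("deployment", vocab))        # → ['deploy', 'ment']
```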
Then, the id sequences corresponding to the original text data are acquired from the Token sequence. For example, the following id sequences corresponding to the original text data may be obtained: the Token id sequence (src_ids), the sentence id sequence (sent_ids), the position id sequence (position_ids), and the mask id sequence (mask_ids), which together form the input feature representation. Which id sequences are specifically acquired may be determined according to actual needs; how each id sequence is acquired is prior art.
In practical applications, a batch processing mode may be adopted, i.e., multiple pieces of original text data are processed simultaneously, with the batch size determined by the batch_size hyperparameter. For different original text data within the same batch, the corresponding id sequences may have different lengths; in this case, the id sequence with the maximum length may be selected, and the other id sequences padded to that maximum length. Generally, the different id sequences corresponding to the same original text data have the same length.
Furthermore, the Tensor structure data can be generated from the padded id sequences: for example, a PaddlePaddle Tensor may be created, the padded id sequences copied into the Tensor's data, and its dimensions (shape) set. The resulting Tensor structure data can be used as the preprocessing result of the original text data and sent to the deployment module 101.
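The batch padding and Tensor construction just described can be sketched as below. NumPy stands in for the framework's Tensor structure, and padding with id 0 is an assumption of this sketch (the actual pad id depends on the model's vocabulary):

```python
# Illustrative batch padding to the max length, with numpy as a Tensor stand-in.
import numpy as np

def pad_batch(id_seqs, pad_id=0):
    # The longest id sequence in the batch sets the target length.
    max_len = max(len(s) for s in id_seqs)
    padded = [s + [pad_id] * (max_len - len(s)) for s in id_seqs]
    # Copy into a dense structure and fix the shape: (batch_size, max_len).
    return np.array(padded, dtype=np.int64)

batch = [[5, 9, 2], [7, 3], [4]]     # three id sequences of different lengths
t = pad_batch(batch)
print(t.shape)   # (3, 3)
```

The mask id sequence mentioned above exists precisely so the model can ignore these padded positions during the forward computation.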
3) Prediction module 103
The prediction module 103 may obtain a preprocessing result of the original text data from the deployment module 101, call a prediction engine of the deep learning framework, obtain a prediction result returned after the prediction engine predicts according to the preprocessing result, and send the prediction result to the deployment module 101.
The deep learning framework may be the PaddlePaddle deep learning framework, and the prediction engine may be the AnalysisPredictor prediction engine.
PaddlePaddle, a deep learning framework developed and open-sourced by Baidu, provides a complete C++ Application Programming Interface (API). AnalysisPredictor is a high-performance prediction engine whose series of optimizations can greatly improve prediction performance.
The prediction module 103 can call the AnalysisPredictor prediction engine of the PaddlePaddle deep learning framework. The initialization stage is responsible for loading the BERT/ERNIE model, creating an AnalysisPredictor object, and calling the AnalysisPredictor's clone interface to create an AnalysisPredictor copy for each thread; the prediction stage calls the AnalysisPredictor's Run interface to execute the forward computation and obtain the output Tensor, i.e., the prediction result.
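The clone-per-thread pattern described above can be illustrated generically in Python — this sketch does not use PaddlePaddle; the `Predictor` class and its `clone`/`run` methods are stand-ins for the engine's interfaces, loading the model once and giving each worker thread its own lightweight copy:

```python
# Generic illustration of the clone-per-thread predictor pattern; hypothetical names.
import copy
import threading

class Predictor:
    def __init__(self, model):
        self.model = model                 # model loaded once at initialization

    def clone(self):
        # Cheap shallow copy: per-thread state is separate, the model is shared.
        return copy.copy(self)

    def run(self, inputs):
        # Stand-in for the engine's Run interface (forward computation).
        return [self.model["scale"] * x for x in inputs]

main = Predictor({"scale": 2})
results = {}

def worker(name, inputs):
    local = main.clone()                   # one clone per thread, no lock needed
    results[name] = local.run(inputs)

threads = [threading.Thread(target=worker, args=(f"t{i}", [i, i + 1])) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)
```

Cloning avoids both reloading the model per thread and serializing all requests through a single shared predictor.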
In addition, the prediction module 103 may adopt a factory-pattern design: the prediction result is obtained by the prediction (Infer) class corresponding to the task type of the original text data. Specifically, the task-type string to be predicted may be obtained from the configuration file, the task type corresponding to the original text data determined from that string, and the corresponding prediction object created by calling a create_inference method.
Fig. 5 is a schematic diagram of the class inheritance relationship of the prediction module 103 in the present disclosure. As shown in fig. 5, BaseInfer is the base class; the classification prediction (ClassifyInfer) class is responsible for predicting text classification tasks, the matching prediction (MatchingInfer) class for text matching tasks, the sequence prediction (SequenceInfer) class for sequence labeling tasks, and the generation prediction (GenerateInfer) class for text generation tasks. In addition, the different Infer classes may each execute corresponding post-processing logic on the output Tensor; the specific processing may be determined according to actual needs.
Text classification, text matching, sequence labeling, text generation and the like are common natural language processing tasks. The different types of tasks may also have different corresponding data pre-processing and post-processing logic, etc.
Through the above processing, prediction can be completed accurately and efficiently, meeting the prediction requirements of different task types.
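The Infer-class factory and per-task post-processing can be sketched as follows. The class names follow the description above; the factory function name, registry keys, and the toy post-processing logic are assumptions of this sketch:

```python
# Illustrative Infer-class factory keyed by task type; details are hypothetical.

class BaseInfer:
    def infer(self, tensor):
        raise NotImplementedError

class ClassifyInfer(BaseInfer):
    def infer(self, tensor):
        # Toy classification post-processing: index of the highest score.
        return max(range(len(tensor)), key=tensor.__getitem__)

class SequenceInfer(BaseInfer):
    def infer(self, tensor):
        # Toy sequence-labeling post-processing: one decision per position.
        return [int(x > 0) for x in tensor]

_INFERS = {"classify": ClassifyInfer, "sequence_label": SequenceInfer}

def create_infer(task_type):
    # Map the task-type string from the config to its Infer class.
    return _INFERS[task_type]()

print(create_infer("classify").infer([0.1, 0.7, 0.2]))  # → 1
```

As with the tokenizer factory, supporting a new task type reduces to registering one more Infer subclass.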
4) Deployment module 101
The deployment module 101 may send the original text data corresponding to the task request of the user to the data preprocessing module 102, obtain the preprocessing result returned by the data preprocessing module 102, send the preprocessing result to the prediction module 103, obtain the prediction result returned by the prediction module 103, and provide the prediction result to the user, and the like.
The predictive deployment system of the present disclosure may support the following deployment modes: a C + + API mode, a BRPC Server mode, and an offline tool mode.
1) C + + API schema
The core code logic of the predictive deployment system framework may be compiled into a dynamic link library (.so) file and packaged together with the interface's header file. When another project uses it, the compile stage includes the header file, the link stage links against the .so file to generate the executable program, and at run time the .so file is placed in the same path as the executable program.
2) BRPC Server mode
To meet the service requirements of online prediction scenes and decouple the user's project code from the prediction deployment module code, a HyperText Transfer Protocol (HTTP) Server can be built based on Baidu's open-source BRPC framework. BRPC is a high-performance, full-featured RPC framework that can fully utilize machine resources and ensure efficient and stable service operation. The BRPC Server is responsible for parsing the user's task request, calling the relevant prediction interface to predict, and returning the prediction result to the user. The user, as a client, requests the BRPC Server over HTTP.
3) Offline tool mode
An offline prediction tool is provided to meet the business requirements of offline batch prediction scenes. When a user has a large amount of offline data to predict, the offline prediction tool can be called to perform offline prediction. The tool reads the user's input data (such as original text data) into memory, starts multiple worker threads, has each worker thread call the relevant prediction interface to predict, and writes the prediction results into a result file. The offline tool mode is particularly well suited to MapReduce tasks on Hadoop distributed systems.
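The offline tool's read-in, fan-out-to-workers, collect-results flow can be sketched as below. The function names and the per-record result shape are illustrative assumptions; a real tool would also write the collected rows to the result file:

```python
# Illustrative offline batch prediction with a pool of worker threads.
from concurrent.futures import ThreadPoolExecutor

def predict_one(text):
    # Stand-in for the per-record prediction interface call.
    return {"text": text, "length": len(text)}

def offline_predict(records, n_workers=4):
    # Records are already read into memory; map preserves input order,
    # so results line up with the input file line by line.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(predict_one, records))

rows = offline_predict(["first line", "second"])
print(rows[0]["length"])   # 10
```

Because each record is independent, the same `predict_one` call can serve as the map function in the MapReduce setting mentioned above.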
The prediction deployment system disclosed by the invention can be suitable for different business scenes such as C + + co-compilation, online service, offline batch prediction and the like, and has wide applicability.
Additionally, the predictive deployment system of the present disclosure may support the following hardware types: Central Processing Unit (CPU), Graphics Processing Unit (GPU), and the Kunlun chip. That is, the prediction deployment system can run on CPUs, GPUs and Kunlun chips, meeting deployment requirements in a variety of hardware scenarios.
With the above introduction in mind, fig. 6 is a schematic diagram of a framework of the predictive deployment system according to the present disclosure, and for specific implementation, reference is made to the foregoing related description, which is not repeated.
The foregoing is a description of system embodiments, and the following is a further description of the aspects of the disclosure by way of method embodiments.
Fig. 7 is a flowchart of an embodiment of a predictive deployment method based on a pre-trained paradigm model according to the present disclosure. As shown in fig. 7, the following detailed implementation is included.
In step 701, the prediction deployment system preprocesses the original text data corresponding to the task request of the user to obtain a preprocessing result meeting the prediction requirement.
In step 702, the prediction deployment system invokes a prediction engine of the deep learning framework, and obtains a prediction result returned after the prediction engine performs prediction according to the preprocessing result, and provides the prediction result to the user.
In addition, the forecast deployment system can also authenticate the user according to the authentication information of the user, and if the authentication result is successful, the subsequent processing can be continued, such as preprocessing the original text data.
The prediction deployment system can also store the predetermined information corresponding to the task request in a database. Which information the predetermined information specifically includes may be determined according to actual needs; for example, it may include information such as the task type, to facilitate subsequent statistics (e.g., statistics on a user's usage of different task types).
The specific manner of preprocessing the original text data may include: the method comprises the steps of carrying out word cutting operation on original text data, cutting the original text data into Token sequences, obtaining id sequences corresponding to the original text data according to the Token sequences, generating Tensor structure data according to the id sequences, and taking the Tensor structure data as a preprocessing result.
And aiming at the obtained preprocessing result, the prediction deployment system can call a prediction engine of the deep learning framework, and obtain a prediction result returned after the prediction engine predicts according to the preprocessing result. The deep learning framework may be a paddlepaddley deep learning framework and the prediction engine may be an AnalysisPredictor prediction engine.
In addition, the predictive deployment system may support the following hardware types: CPU, GPU and Kunlun chip.
The predictive deployment system may also support the following deployment patterns: a C + + API mode, a BRPC Server mode, and an offline tool mode, etc.
It is noted that while for simplicity of explanation, the foregoing method embodiments are described as a series of acts, those skilled in the art will appreciate that the present disclosure is not limited by the order of acts, as some steps may, in accordance with the present disclosure, occur in other orders and concurrently. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required for the disclosure.
For a specific work flow of the method embodiment shown in fig. 7, reference is made to the related description in the foregoing system embodiment, and details are not repeated.
In a word, by adopting the scheme of the embodiment of the method, the prediction, the deployment and the like can be automatically completed by the prediction deployment system based on the pre-training paradigm model aiming at the task request of the user, so that the learning and development cost of the user is reduced, the processing efficiency is improved, and the like.
The scheme disclosed herein can be applied to the field of artificial intelligence, in particular to deep learning, natural language processing, computer vision, and the like.
Artificial intelligence is the discipline of studying how to make computers simulate certain human thinking processes and intelligent behaviors (such as learning, reasoning, thinking, and planning), and covers both hardware and software technologies. Artificial intelligence hardware technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, and the like; artificial intelligence software technologies mainly include computer vision, speech recognition, natural language processing, machine learning/deep learning, big data processing, and knowledge graph technologies.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
FIG. 8 illustrates a schematic block diagram of an example electronic device 800 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 8, the device 800 includes a computing unit 801 that can perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 802 or a computer program loaded from a storage unit 808 into a random access memory (RAM) 803. In the RAM 803, various programs and data required for the operation of the device 800 can also be stored. The computing unit 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
A number of components in the device 800 are connected to the I/O interface 805, including: an input unit 806, such as a keyboard, a mouse, or the like; an output unit 807 such as various types of displays, speakers, and the like; a storage unit 808, such as a magnetic disk, optical disk, or the like; and a communication unit 809 such as a network card, modem, wireless communication transceiver, etc. The communication unit 809 allows the device 800 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 801 may be any of a variety of general and/or special purpose processing components with processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, microcontroller, and the like. The computing unit 801 performs the various methods and processes described above, such as the methods described in this disclosure. For example, in some embodiments, the methods described in this disclosure may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program can be loaded and/or installed onto the device 800 via the ROM 802 and/or the communication unit 809. When the computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more steps of the methods described in the present disclosure may be performed. Alternatively, in other embodiments, the computing unit 801 may be configured by any other suitable means (e.g., by means of firmware) to perform the methods described in the present disclosure.
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (16)

1. A predictive deployment system based on a pre-trained paradigm model, comprising: a deployment module, a data preprocessing module, and a prediction module;
the deployment module is configured to, for a task request of a user, send original text data corresponding to the task request to the data preprocessing module, acquire a preprocessing result returned by the data preprocessing module, send the preprocessing result to the prediction module, acquire a prediction result returned by the prediction module, and provide the prediction result to the user;
the data preprocessing module is configured to preprocess the original text data to obtain a preprocessing result meeting the prediction requirement, comprising: obtaining a tokenization-type character string from a configuration file provided by the user, determining the tokenization mode corresponding to the original text data according to the tokenization-type character string, cutting the original text data into a Token sequence using the Tokenizer class corresponding to the tokenization mode, obtaining the identification (id) sequence corresponding to the original text data according to the Token sequence, copying the id sequence into created Tensor structure data, and sending the Tensor structure data to the deployment module as the preprocessing result;
the prediction module is configured to call a prediction engine of a deep learning framework, acquire the prediction result returned after the prediction engine performs prediction according to the preprocessing result, and send the prediction result to the deployment module.
2. The system of claim 1, further comprising: an authentication statistical module;
the deployment module is further used for sending the acquired authentication information of the user to the authentication statistical module, acquiring an authentication result returned by the authentication statistical module, and if the authentication result is successful, sending the original text data to the data preprocessing module;
the authentication statistical module is used for authenticating the user according to the authentication information of the user and sending an authentication result to the deployment module.
3. The system of claim 2, wherein,
the authentication statistical module is further used for storing the preset information corresponding to the task request into a database.
4. The system of claim 1, wherein,
the tokenization operation of the data preprocessing module adopts a factory design pattern.
5. The system of claim 1, wherein,
the deep learning framework includes: a PaddlePaddle deep learning framework;
the prediction engine includes: an AnalysisPredictor prediction engine.
6. The system of claim 1, wherein,
the prediction module adopts a factory design pattern and obtains the prediction result using the prediction (Infer) class corresponding to the task type of the original text data.
7. The system of claim 1, wherein,
the system supports the following hardware types: a central processing unit (CPU), a graphics processing unit (GPU), and a Kunlun chip.
8. The system of claim 1, wherein,
the system supports the following deployment modes: a C++ application programming interface (API) mode, a Baidu open-source remote procedure call (BRPC) Server mode, and an offline tool mode.
9. A prediction deployment method based on a pre-training paradigm model, comprising:
for a task request of a user, preprocessing, by a prediction deployment system, original text data corresponding to the task request to obtain a preprocessing result meeting the prediction requirement, comprising: obtaining a tokenization-type character string from a configuration file provided by the user, determining the tokenization mode corresponding to the original text data according to the tokenization-type character string, cutting the original text data into a Token sequence using the Tokenizer class corresponding to the tokenization mode, obtaining the identification (id) sequence corresponding to the original text data according to the Token sequence, copying the id sequence into created Tensor structure data, and taking the Tensor structure data as the preprocessing result;
and the prediction deployment system calls a prediction engine of a deep learning framework, acquires a prediction result returned after the prediction engine predicts according to the preprocessing result, and provides the prediction result for the user.
10. The method of claim 9, further comprising:
and the prediction deployment system authenticates the user according to the authentication information of the user, and if the authentication result is successful, the original text data is preprocessed.
11. The method of claim 10, further comprising:
and the forecast deployment system stores the preset information corresponding to the task request into a database.
12. The method of claim 9, wherein,
the deep learning framework includes: a PaddlePaddle deep learning framework;
the prediction engine includes: an AnalysisPredictor prediction engine.
13. The method of claim 9, wherein,
the prediction deployment system supports the following hardware types: a central processing unit (CPU), a graphics processing unit (GPU), and a Kunlun chip.
14. The method of claim 9, wherein,
the prediction deployment system supports the following deployment modes: a C++ application programming interface (API) mode, a Baidu open-source remote procedure call (BRPC) Server mode, and an offline tool mode.
15. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 9-14.
16. A non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 9-14.
CN202011505461.6A 2020-12-18 2020-12-18 Predictive deployment system, method, apparatus and medium based on pre-training paradigm model Active CN112507102B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011505461.6A CN112507102B (en) 2020-12-18 2020-12-18 Predictive deployment system, method, apparatus and medium based on pre-training paradigm model


Publications (2)

Publication Number Publication Date
CN112507102A CN112507102A (en) 2021-03-16
CN112507102B true CN112507102B (en) 2022-04-29

Family

ID=74921729

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011505461.6A Active CN112507102B (en) 2020-12-18 2020-12-18 Predictive deployment system, method, apparatus and medium based on pre-training paradigm model

Country Status (1)

Country Link
CN (1) CN112507102B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113010181B (en) * 2021-03-24 2022-05-27 北京百度网讯科技有限公司 Deployment method and device of operators in deep learning framework and electronic equipment
CN113608762A (en) * 2021-07-30 2021-11-05 烽火通信科技股份有限公司 Deep learning multi-model unified deployment method and device
CN116028235A (en) * 2021-10-26 2023-04-28 腾讯科技(深圳)有限公司 Self-media information processing method and device, electronic equipment and storage medium
CN117035065A (en) * 2023-10-10 2023-11-10 浙江大华技术股份有限公司 Model evaluation method and related device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019140242A1 (en) * 2018-01-11 2019-07-18 TUPL, Inc. User support system with automatic message categorization
CN110083831A (en) * 2019-04-16 2019-08-02 武汉大学 A kind of Chinese name entity recognition method based on BERT-BiGRU-CRF
CN110414187A (en) * 2019-07-03 2019-11-05 北京百度网讯科技有限公司 Model safety delivers the system and method for automation

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111860853B (en) * 2019-05-22 2024-01-12 北京嘀嘀无限科技发展有限公司 Online prediction system, device, method and electronic device


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A Super-Detailed Guide to Using the Chinese Pre-trained Model ERNIE; PaddlePaddle; Zhihu; 2019-08-05; pp. 1-6 *

Also Published As

Publication number Publication date
CN112507102A (en) 2021-03-16

Similar Documents

Publication Publication Date Title
CN112507102B (en) Predictive deployment system, method, apparatus and medium based on pre-training paradigm model
US11334692B2 (en) Extracting a knowledge graph from program source code
EP3451192A1 (en) Text classification method and apparatus
CN106919555B (en) System and method for field extraction of data contained within a log stream
CN107301170B (en) Method and device for segmenting sentences based on artificial intelligence
CN108090351B (en) Method and apparatus for processing request message
US20160306852A1 (en) Answering natural language table queries through semantic table representation
US11481442B2 (en) Leveraging intent resolvers to determine multiple intents
US10275456B2 (en) Determining context using weighted parsing scoring
US11645122B2 (en) Method, device, and computer program product for managing jobs in processing system
US20220237567A1 (en) Chatbot system and method for applying for opportunities
US11775894B2 (en) Intelligent routing framework
WO2023142451A1 (en) Workflow generation methods and apparatuses, and electronic device
US10673789B2 (en) Bot-invocable software development kits to access legacy systems
CN112528641A (en) Method and device for establishing information extraction model, electronic equipment and readable storage medium
US10776411B2 (en) Systematic browsing of automated conversation exchange program knowledge bases
CN110852057A (en) Method and device for calculating text similarity
US20230153541A1 (en) Generating and updating conversational artifacts from apis
CN113408304B (en) Text translation method and device, electronic equipment and storage medium
CN113204613B (en) Address generation method, device, equipment and storage medium
CN113110874A (en) Method and device for generating code structure diagram
CN114118937A (en) Information recommendation method and device based on task, electronic equipment and storage medium
CN112528027A (en) Text classification method, device, equipment, storage medium and program product
CN114492456B (en) Text generation method, model training method, device, electronic equipment and medium
CN112989797B (en) Model training and text expansion methods, devices, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant