CN117032643A - Intelligent informatization system based on AI large language model and construction method - Google Patents

Intelligent informatization system based on AI large language model and construction method Download PDF

Info

Publication number
CN117032643A
CN117032643A CN202311023967.7A CN202311023967A CN117032643A CN 117032643 A CN117032643 A CN 117032643A CN 202311023967 A CN202311023967 A CN 202311023967A CN 117032643 A CN117032643 A CN 117032643A
Authority
CN
China
Prior art keywords
content
module
user
information
subsystem
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311023967.7A
Other languages
Chinese (zh)
Inventor
朱玮
杨波
沈峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ignorance Beijing Smart Technology Co ltd
Original Assignee
Ignorance Beijing Smart Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ignorance Beijing Smart Technology Co ltd filed Critical Ignorance Beijing Smart Technology Co ltd
Priority to CN202311023967.7A priority Critical patent/CN117032643A/en
Publication of CN117032643A publication Critical patent/CN117032643A/en
Pending legal-status Critical Current

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/20Software design
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application relates to the technical field of large model application, and particularly discloses an intelligent informatization system and a construction method based on an AI large language model, wherein the system comprises an intelligent informatization subsystem, a global routing platform and a content providing module; the global routing platform is used for providing information routing service for the intelligent informatization subsystem; the content providing module is used for providing specified content; the intelligent informatization subsystem comprises: a dialogue module for users to input dialogue content; the large model module comprises a multi-mode large model and is used for analyzing user instructions and user contents from dialogue contents, analyzing information required to be acquired by a user and sorting the information into characteristic data; the content access module is used for acquiring user instructions and characteristic data; the execution module is used for executing corresponding operation according to the user instruction; and the information acquisition module is used for acquiring the content from the external website according to the characteristic data. The technical scheme of the application has intelligent capability and can help individuals and enterprise users to process related data and information.

Description

Intelligent informatization system based on AI large language model and construction method
Technical Field
The application relates to the technical field of large model application, in particular to an intelligent informatization system and a construction method based on an AI large language model.
Background
The current informatization system is divided into independent software, application and website according to the function of the service user. Each of the information systems provides services to users around a topic, such as resource planning by ERP as an enterprise, customer management by CRM as an enterprise, and online sales by e-commerce websites. Whether an enterprise or an individual, may choose to operate different information systems while doing different tasks. Most of the current application modes of informatization systems are based on formatted forms, wherein each field in the form represents a contracted meaning of data. The user needs to be aware of the meaning of these fields, careful operation, and high learning costs.
Moreover, after years of development, the mode of the information system is very mature, and the construction mode and the operation method of the information system are not changed greatly although the technology is updated continuously. The building mode is usually subjected to steps of demand analysis, system design, code development, testing, deployment and the like, wherein the steps are further subdivided into a plurality of sub-steps, and each step needs to generate a plurality of documents for communicating demands among developers. The whole process is high in cost, time-consuming and labor-consuming.
In the current informationized system mode, a user is exposed to a large amount of information and data, including files, system data, internet web content, etc., from which it is difficult for the user to find the content he needs. In particular, the internet mode has further driven the dissemination and production of content, but has also been inundated with endless content.
In addition to the information system, a traditional personal information application is a data-based access and processing method for individual users. For example, to manage a schedule, there is a piece of daily management software that keeps track of activities on the calendar and requires the application to remind on time; a meeting record is recorded, a notebook is used, the meeting record is stored after recording, and searching is carried out when needed; to access sports news, a browser is used to browse on a search engine and various news websites. In summary, the traditional personal information application mode has the following characteristics:
firstly, a user needs to use a plurality of different applications, information and functions are scattered on tens or even hundreds of applications, and the use is inconvenient;
secondly, various information of the user is stored on a huge hard disk, files are in a plurality of forms, and the information is very difficult to find when the information is needed;
third, the method of obtaining external information, such as news, by the user is time-consuming and labor-consuming, and so-called "surfing" is effectively a random browsing. Under the condition that the information searching target is clear, the efficiency is very low;
fourth, the arrangement of the user information is very troublesome, and in many cases, the user does not arrange the information and searches only when the information is needed according to keywords;
fifth, when a user needs a special information system application, he must search around, and if he cannot find it, he needs to write code development, and the cost of code development is high.
With the advent of large models like ChatGPT, the understanding ability of AI is jumping, AI starts to generate language fluently, can compete with tasks such as semantic understanding, and keeps memory and logic processing ability in dialogue, and even can generate program codes according to human language requirements. The intelligence of the large model brings possibility for the mode change of the informatization system, and the operation method, the use method and the construction method of the informatization system can generate fundamental change under the support of the large model.
Therefore, there is a need for an AI-based large language model intelligent informatization system and method of construction that has intelligent capabilities that can assist individuals and business users in processing relevant data and information.
Disclosure of Invention
One of the purposes of the application is to provide an intelligent informatization system based on an AI large language model, which has intelligent capability and can help individuals and enterprise users to process related data and information.
In order to solve the technical problems, the application provides the following technical scheme:
an intelligent informatization system based on an AI large language model comprises an intelligent informatization subsystem, a global routing platform and a content providing module;
the global routing platform is used for providing information routing service for the intelligent informatization subsystem;
the content providing module is used for providing specified content;
the intelligent informatization subsystem comprises a registration account opening module, a large model module, a dialogue module, a content access module, an execution module, an information characteristic module, a content providing module, an information acquisition module and an information base;
the registration account opening module is used for providing a unique address on the Internet after the user registers for account opening;
the dialogue module is used for inputting dialogue content by a user;
the large model module comprises a multi-mode large model which is used for analyzing user instructions and user contents from dialogue contents; the method is also used for analyzing information required to be acquired by a user from the dialogue content and sorting the information into characteristic data;
the content access module is used for acquiring user instructions and transmitting the user instructions to the execution module for execution; the method is also used for acquiring characteristic data;
the execution module is used for executing corresponding operation according to the user instruction;
the information feature module is used for storing feature data;
the information acquisition module is used for acquiring content from an external website according to the characteristic data;
the information base is used for storing the content from the information acquisition module and the content providing module; and also for storing data from the execution module.
Further, the large model module is also configured to provide artificial intelligence capabilities including: conversations, code generation, function calls, ebadd generation, json generation, translation, lexical entity recognition, emotion classification, text classification, and abstract generation.
Further, the dialogue module includes an interface for man-machine dialogue, and the dialogue content includes: text dialogs, files, tables, codes, graphics, and videos.
Further, the execution module comprises a file system, a database, a script/library environment and a plurality of interface programs;
the interface program comprises a software system and an application program which are deployed locally or at the cloud;
the content access module is also used for analyzing the multi-mode large model to obtain user content and storing the user content into a file system or a database; user content includes text, json strings, vectors, SQL, and program code;
the content access module is also used for calling the file system and the database to execute basic operation, wherein the basic operation comprises newly-built folders or newly-built database tables;
the script/library environment is used to deploy script code from multimodal big model generation.
Further, the information feature module is further used for providing feature data for visitors to access;
the characteristic data includes: and the natural language form adopts constraint conditions and vectors constructed by logic symbols.
Further, the global routing platform is used for externally disclosing the access address of the intelligent informatization subsystem after the intelligent informatization subsystem is registered, so that the content providing module can access and push the content;
the global routing platform is also used for matching the content provided by the content providing module with the characteristic data of the information required by the intelligent informatization subsystem.
Further, the global routing platform is further used for sorting all the contents from the content providing module, including sorting out content classification, content keywords and content abstracts;
and the content matching module is also used for performing content matching according to the characteristic data of the intelligent informatization subsystem, and if the matching is successful, transmitting the content to the registered address of the intelligent informatization subsystem.
Further, the address provided by the registration account opening module is an API address, including an API address I, an API address II, an API address III and an API address IV;
the API address is used for informing the visitor of the basic situation of the intelligent informatization subsystem, wherein the basic situation comprises the purpose of the intelligent informatization subsystem and the function provided by the intelligent informatization subsystem;
the API address II is used for informing the information acquisition module and the content providing module of information required by the intelligent informatization subsystem;
the API address is used for receiving the content sent by the content providing module; the data format of the content includes TXT, JSON, HTML;
the API address four is used for an external visitor user or an external AI large model to initiate a dialogue with the intelligent informatization subsystem.
The second object of the present application is to provide a method for constructing an intelligent informatization system based on an AI large language model, which is applied to the above system, and comprises the following steps:
s1, receiving registration account opening of a user through a registration account opening module, and providing a unique address on the Internet;
s2, receiving dialogue contents input by a user through a dialogue module;
s3, analyzing a user instruction and user content from the dialogue content through a multi-mode large model of the large model module; analyzing information required to be acquired by a user from the dialogue content, and sorting the information into characteristic data;
s4, acquiring a user instruction through the content access module, and delivering the user instruction to the execution module for execution; feature data is also acquired;
s5, the execution module executes corresponding operation according to the user instruction;
s6, storing the characteristic data through an information characteristic module;
s7, the information acquisition module acquires content from an external website according to the characteristic data;
s8, providing information routing service for the intelligent informatization subsystem through the global routing platform;
s9, the content providing module provides specified content for the global routing platform; the global routing platform performs content matching according to the characteristic data of the intelligent informatization subsystem, and if the matching is successful, the global routing platform sends the content to the registered address of the intelligent informatization subsystem;
s10, storing contents from the information acquisition module and the content providing module through an information base; and storing the data from the execution module.
In step S3, the multimodal big model analyzes the user instruction that needs to be stored, queried or executed by the program in the dialogue content, converts the corresponding natural language in the dialogue content into a user instruction of a formatted language, sends the user instruction to the content access module, and informs the content access module of the execution module that needs to be called.
The application has the remarkable effects that: the scheme is supported by the understanding force of the multi-mode large model, a user does not need to interact with the systems of a plurality of independent topics, and interacts with a plurality of functional systems in a unique mode of dialogue with the large model. The user does not need to learn forms and fields of the informatization system, the informatization system is hidden behind a dialogue box of the dialogue module, and the user can operate the intelligent informatization subsystem only by using natural language. The construction of the intelligent informatization subsystem can be completed by the natural language expression of the user under the support of the large model, namely, the user can complete the development, deployment and automatic operation of the system code through describing the requirement by the multi-mode large model. Under the support of the multi-mode large model, the scheme can automatically collect information according to the instruction of the user, help the user to arrange and provide the most accurate information for the user.
In summary, the scheme has intelligent capability and can help individual users to process data and information related to individuals.
Drawings
FIG. 1 is a logic block diagram of an intelligent informatization system based on an AI large language model according to an embodiment;
FIG. 2 is a schematic diagram of a dialog box in an intelligent informatization system based on an AI large language model according to an embodiment;
FIG. 3 is a schematic diagram of content access and execution in an intelligent informatization system based on an AI large language model according to an embodiment;
FIG. 4 is a schematic diagram of automated generation and deployment of a program in an intelligent informatization system based on an AI large language model according to an embodiment;
fig. 5 is a schematic diagram of a global routing platform in an intelligent informatization system based on an AI large language model according to an embodiment.
Detailed Description
The following is a further detailed description of the embodiments:
example 1
As shown in fig. 1, an intelligent informatization system based on an AI large language model of the present embodiment includes an intelligent informatization subsystem, a global routing platform and a content providing module;
the intelligent informatization subsystem comprises a registration account opening module, a large model module, a dialogue module, a content access module, an execution module, an information characteristic module, an information acquisition module and an information base.
The registration account opening module is used for providing a unique address on the Internet after the user registers and opens an account by adopting the technologies of Internet IP, a domain name system, a P2P network and the like, and the address can be used for identifying and accessing the user;
in this embodiment, a global address system of an intelligent informatization subsystem is constructed based on the IP and domain name system of the internet, and includes four basic API addresses:
and the API address I is used for informing the visitor when providing services to the outside, and the basic condition of the intelligent informatization subsystem comprises the purpose of the intelligent informatization subsystem and the function provided by the intelligent informatization subsystem.
And the API address II is used for informing the crawler and an external content providing module of which information is needed by the intelligent informatization subsystem.
In this embodiment, two methods are used to describe the information requirement:
first, adopting Json file to describe the key words of the requirement, comprising simple logic;
examples of demand keywords described by the Json file are as follows:
[ "category": "technology",
“keywords”:
{ "name": "AI big model",
“similarity”:“0.8”,
“exclude”:“None”},
{ "name": "information system",
“similarity”:“0.7”,
"include" information System "})
Second, complex language requirements are described using vector files.
When the complex language requirement is described by adopting a vector file, the vector technology is based on an Embedding technology of a multi-mode large model, specifically, text-Embedding-ada-002 of ChatGPT is adopted, and the dimension of an output vector is 1536.
And the API address III is used for receiving the content sent by the external content providing module. The data format of the content includes TXT, JSON, HTML.
And the API address IV is used for providing external dialogue communication, namely an external visitor user or an external AI large model, and can initiate dialogue with the intelligent informatization subsystem. This chat may communicate information defined by the system user.
And the large model module is used for providing artificial intelligence capability and comprises a multi-mode large model, wherein the multi-mode large model can be accessed through an API (application program interface) mode and can also be deployed locally. The parameter scale of the selected multi-mode large model is required to be over trillion, and the parameter scale is evaluated on translation, contextual memory, code generation, mathematical problem solution, multidisciplinary problem solution, reading understanding and various qualification tests to reach a set level.
In this embodiment, the multi-mode large model adopts a large model of OpenAI, and uses an API interface for access, and the used model includes: gpt-3.5-turbo, text-casting-ada-002. Artificial intelligence capabilities for applying multi-modal large models include: conversations, code generation, function calls, ebadd generation, json generation, translation, lexical entity recognition, emotion classification, text classification, abstract generation, and the like.
A dialog module comprising an interface for man-machine dialog that is capable of presenting various forms of dialog content, such as text dialog, files, tables, codes, graphics, video, etc. The interface may be based on Web or various client technologies such as Windows and iOS. In this embodiment, a dialog box is shown in fig. 2, and the dialog box has the following components:
an input box for a user to input dialogue content;
the display box is used for displaying the multi-mode big model and the user dialogue content;
the data browser is used for displaying various data, wherein the data comprises text data, icon data, graphic data and formatted data; formatting data such as tables, forms, and the like; the data browser is also used for executing input operation and clicking various button operations;
the labels are page labels of a plurality of data browsers and are used for switching among pages of the data browsers after clicking.
In this embodiment, the dialog content entered in the dialog box by the user is submitted to the multimodal large model through the API interface, and the result generated by the multimodal large model is also transferred to the dialog box through the API interface.
In this embodiment, two multi-modal large models (i.e., two large model session processes) are set, where one multi-modal large model (large model one) is used for session and content generation, and the other multi-modal large model (large model two) is used for judging whether there is functional content such as information storage, information query, function call, etc. in the session.
When the large model II judges that functional content exists in one dialogue, the large model I carries out function analysis on the dialogue, analyzes functions and parameters to be called, and sends the functions and parameters to the content access module through an API interface;
and the content access module can call various information processing methods and modules and is used for obtaining user instructions and user contents which are analyzed by the multi-mode large model from the dialogue input by the dialogue box user through the API interface of the multi-mode large model, wherein the user instructions comprise texts, json strings, vectors, SQL, program codes and the like. The content access module is also used for transmitting the user instruction analyzed by the multi-mode large model to the execution module for execution according to the agreed format.
In this embodiment, the interface module calls the external function interface according to the requirement of the multi-mode large model, and transfers the parameters. In this embodiment, all the interface modules are executed, encapsulated in the content access module.
The content access module is between the multimodal mass model and the content storage and execution. That is, the multimodal large model finds out a part to be stored or a part to be queried and a part to be executed by a program in the dialogue content, and converts a natural language from the dialogue content into a format language such as Json, SQL, linux command, and sends the format language to the content access module, and informs the content access module of an execution module to be called. That is, the multimodal big model can parse content in natural language into Json strings for function calls, for example, in the following format:
{‘role':‘assistant',
‘content':None,
‘function_call':
{'name':'get_temperature',
'arguments':
'n' location ',' Beijing city ',' n
‘format’:‘celsius’}'
}
}
As shown in FIG. 3, the execution module includes a file system, a database, a script/library environment, and several interface programs.
The interface program comprises various software systems and application programs deployed locally or in the cloud; can be called by the content access module. The interface program can be a Rest style API, and can also be an interface in other custom formats.
The method specifically comprises the following steps: an application deployed at a local server; automatically writing by the large model, and automatically deploying an application program in a script environment in the subsystem; applications on other remote servers;
the file system and the database are special interface programs, and refer to the operation of the file system and the local database of the server where the subsystem is deployed. In user conversations, storing and retrieving content is a conventional and frequent operation, so the present subsystem is designed to be a special class of interface programs. Operations on the file system include: adding, deleting and checking folders and adding and deleting and checking files; the local database includes: various databases such as a relational database, a vector database, a graph database, and the like.
The content access module is also used for storing the user content analyzed by the multi-mode big model from the dialog box into a file system or a database;
the content access module is also used for calling the file system and the database to perform basic operations, such as creating a folder or creating a database table;
in this embodiment, the file system operates by the os module of python to perform the following functions:
os.mkdir, create directory
os.rmdir, delete directory
os. Listdir (path), listing all directories and files
os.open (", mode= 'w'), open and create file
os. Remove (path), delete file
The script/library environment is used for being called by the content access module, is used for deploying script codes generated by the multi-mode big model and establishing a database; the database is one of databases of the execution module, and the databases of the execution module also comprise databases manually established by other traditional methods.
After deployment is completed, the script code can directly run to provide functions for the multi-mode large model and the dialog box;
the content that the multi-modal large model needs to generate includes: script program, database and table SQL, script deployment command and script operation command; after the multi-mode large model generates the relevant codes and commands of the programs, the codes and commands are transmitted to a local calling interface program of the deployment environment for execution. The whole process does not need manual intervention, and all development, deployment and operation works are completed only in a dialogue with a user;
in this embodiment, the python environment, specifically the python 3.9.0 version, is deployed in the unbuntu linux environment. In this embodiment, a mysql database is also deployed, specifically mysql 8.0 in the unbuntu linux environment.
In this embodiment, as shown in fig. 4, from the dialogue content between the multimodal big model and the user, the following is resolved: code scripts, database scripts, deployment commands, and run commands;
the code script is a python code program, and in the code script, an installation script of a python dependency package, namely a 'pip install dependency package', needs to be analyzed; the database script is a database and table building script of the database; the deploy command saves the python file under the directory of the python environment; the operation command is: "python filename. Py".
And the information characteristic module is used for storing characteristic data of information required to be acquired by a user. The profile data records a description of the user's need to collect information for the subsystem. The information required to be acquired by the user is extracted from the dialogue content of the dialogue box, and after the multi-mode large model is analyzed, the information is arranged into characteristic data, and the characteristic data is stored into the information characteristic module through the content access module.
The information feature module provides feature data in the form of files or databases for visitors to access, so that the visitors can conveniently know information required to be collected by one subsystem. Functionally resembling the Robot file of a website.
The characteristic data includes the following forms:
feature data in natural language form, such as keywords, or sentences describing the requirements;
based on natural language, adopting constraint conditions constructed by logic symbols;
and vectors generated for natural language by adopting technologies such as large models, BERT, TF-IDF and the like.
In the present embodiment, the feature data is stored in two ways: storing information classification and keywords through Json files; language descriptions of vector storage features also generated by the multimodal mass model;
the two modes are respectively stored into files, and the two characteristic files are in parallel relation, namely, any one of the two characteristic files can be satisfied.
The storage format of the Json file is as follows:
[ "category": "technology",
“keywords”:
{
"name": "AI big model",
“similarity”:“0.8”,
“exclude”:“None”
},
{
"name": "information system",
“similarity”:“0.7”,
' include ' information system '
}]
Wherein category is the overall classification of the collected content, in this embodiment, classification is science and technology, keywords are Keywords that are required to be contained in the collected content, similarity is a threshold that is required to be reached by the similarity, and include is what Keywords are not permitted to be contained.
The language description of the vector storage characteristic generated by the multi-mode large model is generated by adopting a text-embedding-ada-002 model in the embodiment, and can be understood as the information requirement V of a natural language description, and the vector V' is generated by a vector mapping function f of the text-embedding-ada-002 model. Namely:
V'=F(V)
all requirements, constitute an array:
T=[V‘ 1 ,V' 2 ,……V' n ]
for conflict among the features, a priority method is adopted to solve, namely, the front features in the array are prioritized. In this embodiment, the judgment is made by the multi-modal large model and Langchain.
A content providing module for providing specified content, including one or more of a website, an external cloud service, or a large model-based content distribution system. In this embodiment, cooperation with four industry news websites is achieved, and the four industry news websites are used as a content providing module and are accessed to the global routing platform. The four news industries are automotive, scientific, sports, and entertainment.
And the information acquisition module is used for acquiring the content from the external website according to the requirements of the information characteristic module. In this embodiment, 4 crawlers are constructed to crawl the content on microblog, fox search news, internet news, hundred degree news, specifically crawl news on channels such as science and technology, education, time administration, entertainment, sports, automobiles, and real estate.
The information base is used for storing the contents from the information acquisition module and the content providing module, namely various files and data; and also for storing data from the execution module. In this embodiment, the information storage means includes: a file system, i.e. a file system based on the unbuntu operating system stores various files; a database, i.e. a mysql-based database, stores various formatted data; the vector database, i.e. based on pincone vector database, stores various vectorized corpus data.
The global routing platform is a component part of the system and operates independently; the system is used for providing information routing service for all intelligent informatization subsystems, namely after a plurality of intelligent informatization subsystems are registered, the access address of the intelligent informatization subsystems is externally disclosed, and the intelligent informatization subsystems are used for being accessed by a content providing module and pushing content; and the system is also used for preparing a multi-mode large model on the global routing platform, and matching the content provided by the content providing module with the characteristics of the information required by the intelligent informatization subsystem.
If a deployed intelligent informatization subsystem needs to acquire external information, registering a global address in a global routing platform, wherein an information characteristic module needs to disclose information characteristics of the global routing platform;
in this embodiment, as shown in fig. 5, a global routing platform is constructed, and four independent intelligent informationized subsystems are respectively registered on the global routing platform and submitted to the information requirements of the global routing platform. Four content providing modules provide content for the global routing platform.
In this embodiment, an interface service of a multimodal big model, for example, a chatGPT big model, is deployed on the global routing platform, and all the contents from the content providing module are sorted, including sorting out content classification, content keywords and content summaries; and then the global routing platform performs content matching according to the characteristic data of the four intelligent informatization subsystems, and if the matching is successful, the content is sent to the interface API address III corresponding to the intelligent informatization subsystem.
For example, after the content of an article is processed by a large model of the global routing platform, the content is classified as c, and the content keyword is k= [ k ] 1 ,k 2 ,…k n ]The vector of the content digest is v. The content in the Json file of the demand characteristic of the intelligent informatization system t is classified as C t The key words areThe feature vector of the content demand isThe following condition judgment is carried out:
{c=C t &k i ∈K t }||v∈V t
wherein, in order to judge k i ∈K t In the embodiment, a pre-trained language model Bert is adopted to calculate K and K t Word vector of the medium elements:
adopt cosine similarity to calculate similarity, letThe cosine similarity of A and B is:
v and V t Both are generated by the vector generation function of the large model, and the comparison of the two is performed by the vector database pincone or can be performed by a cosine similarity algorithm.
If the condition is met, the global routing platform sends the information content to the intelligent informatization subsystem t.
When the conditions are judged to have conflicts, the priority is given to the conditions according to the sequence numbers of the conditions. For example, if in a feature vector of a content requirement And->Conflict, then->Cover->
The embodiment also provides a construction method of the intelligent informatization system based on the AI large language model, which is applied to the system and comprises the following contents:
s1, receiving registration account opening of a user through a registration account opening module, and providing a unique address on the Internet;
s2, receiving dialogue contents input by a user through a dialogue module;
s3, analyzing a user instruction and user content from the dialogue content through a multi-mode large model of the large model module; analyzing information required to be acquired by a user from the dialogue content, and sorting the information into characteristic data; in this embodiment, in the dialog content, the multimodal big model analyzes a user instruction that needs to be stored, needs to be queried, or needs to be executed by a program, converts a corresponding natural language in the dialog content into a user instruction of a formatted language, sends the user instruction to the content access module, and informs the content access module of an execution module that needs to be called.
S4, acquiring a user instruction through the content access module, and delivering the user instruction to the execution module for execution; feature data is also acquired;
s5, the execution module executes corresponding operation according to the user instruction;
s6, storing the characteristic data through an information characteristic module;
s7, the information acquisition module acquires content from an external website according to the characteristic data;
s8, providing information routing service for the intelligent informatization subsystem through the global routing platform;
s9, the content providing module provides specified content for the global routing platform; the global routing platform performs content matching according to the characteristic data of the intelligent informatization subsystem, and if the matching is successful, the global routing platform sends the content to the registered address of the intelligent informatization subsystem;
s10, storing contents from the information acquisition module and the content providing module through an information base; and storing the data from the execution module.
By using the scheme of the embodiment, a user can obtain a unique address which can be globally accessed in the Internet for the personal intelligent informatization subsystem of the user by registering and opening an account; the user dialogues with the multi-mode large model through the dialog box, the user dialogues are processed by the multi-mode large model, and the multi-mode large model extracts content and instructions for processing information and applications required by the user; content and instructions are distributed by the program to different execution flows; the content and instructions may be stored on a file system, may be stored on a database, or may be deployed as an executable program; features in the user session that require information to be collected are stored in the system; setting an independent public address global routing platform for each running intelligent informatization subsystem to register on and issue and collect information characteristics; various content providing modules, such as websites, cloud services and the like, can issue information to the addresses of various intelligent informatization subsystems according to the information requirement characteristics; setting a crawler to actively grasp contents meeting characteristic requirements; store all content to the information database.
This embodiment brings the following benefits:
1. the unique application interface is based on a dialog box of the multi-mode large model, and does not need to interact with various applications;
2. having dialogue intelligence, can be through natural language and user's dialogue; the user does not need to use various formatted operation interfaces, such as filling in a data form, but directly uses a dialog box to inform the user of the requirements;
3. various functions of the system, various data, are provided through dialog boxes. For example, the user in a dialog box notifies the multimodal big model that 7 pm has a meeting in the office, please advance 1 day for a reminder. The multimodal big model can understand the event and store it in the daily schedule application system and remind it on time;
4. the multi-mode big model can bear the information arrangement work of the user, and all files and data are arranged and stored after the key words and the abstract are arranged after being understood by the multi-mode big model. What information is needed by the user, and the multi-mode large model can be searched out only by informing the multi-mode large model. The multi-modal large model can replace a real human secretary.
5. Writing codes remains a difficult skill for the average user to master. Developing a system application is also a time-consuming and labor-consuming project for professional programmers. The multi-mode large model has a certain code capability, and codes can be written according to natural language instructions of human beings. Therefore, the personal information system has the functions of writing out codes according to natural language demands sent by users and automatically deploying the codes into the environment for operation. For example, if a user wants to develop an application for managing date of birthday and reminding, the multimodal big model can write out python code and database sql statement according to the requirement, and deploy the python code and database sql statement into the environment to directly run for the user to use in the dialog box.
6. When a user has a goal to collect information, for example, the user wants to collect related news and interviews of basketball games, the user needs to use a search engine and various sports websites to search on the search engine, and the current search engine lacks intelligence, so that the user can only search for pages by himself. Under the support of a multi-mode large model, the system can actively receive and crawl related contents according to the requirements of users, and provide the contents to the users in time when the users need the contents.
7. Information produced by traditional websites and software applications is processed and consumed by people, and the information requirement is limited by the capabilities of people. Under the support of the multi-mode large model, the system screens information through the multi-mode large model and finally provides the information for users. The links of transmission, screening, arrangement and the like are borne by the multi-mode large model, the capability is greatly improved, so that the information transmission mode of the Internet is changed, a large amount of information production, transmission, screening, processing, arrangement and the like work under the support of the multi-mode large model, the capability of the Internet is greatly improved, and the information mode of the Internet is changed.
The foregoing is merely an embodiment of the present application, the present application is not limited to the field of this embodiment, and the specific structures and features well known in the schemes are not described in any way herein, so that those skilled in the art will know all the prior art in the field before the application date or priority date of the present application, and will have the capability of applying the conventional experimental means before the date, and those skilled in the art may, in light of the present application, complete and implement the present scheme in combination with their own capabilities, and some typical known structures or known methods should not be an obstacle for those skilled in the art to practice the present application. It should be noted that modifications and improvements can be made by those skilled in the art without departing from the structure of the present application, and these should also be considered as the scope of the present application, which does not affect the effect of the implementation of the present application and the utility of the patent. The protection scope of the present application is subject to the content of the claims, and the description of the specific embodiments and the like in the specification can be used for explaining the content of the claims.

Claims (10)

1. An intelligent informatization system based on an AI large language model is characterized by comprising an intelligent informatization subsystem, a global routing platform and a content providing module;
the global routing platform is used for providing information routing service for the intelligent informatization subsystem;
the content providing module is used for providing specified content;
the intelligent informatization subsystem comprises a registration account opening module, a large model module, a dialogue module, a content access module, an execution module, an information characteristic module, a content providing module, an information acquisition module and an information base;
the registration account opening module is used for providing a unique address on the Internet after the user registers for account opening;
the dialogue module is used for inputting dialogue content by a user;
the large model module comprises a multi-mode large model which is used for analyzing user instructions and user contents from dialogue contents; the method is also used for analyzing information required to be acquired by a user from the dialogue content and sorting the information into characteristic data;
the content access module is used for acquiring user instructions and transmitting the user instructions to the execution module for execution; the method is also used for acquiring characteristic data;
the execution module is used for executing corresponding operation according to the user instruction;
the information feature module is used for storing feature data;
the information acquisition module is used for acquiring content from an external website according to the characteristic data;
the information base is used for storing the content from the information acquisition module and the content providing module; and also for storing data from the execution module.
2. The AI-large language model-based intelligent informatization system of claim 1, wherein: the large model module is also for providing artificial intelligence capabilities including: conversations, code generation, function calls, ebadd generation, json generation, translation, lexical entity recognition, emotion classification, text classification, and abstract generation.
3. The AI-large language model-based intelligent informatization system of claim 2, wherein: the dialogue module comprises an interface of man-machine dialogue, and dialogue content comprises: text dialogs, files, tables, codes, graphics, and videos.
4. The AI-large language model-based intelligent informatization system of claim 3, wherein: the execution module comprises a file system, a database, a script/library environment and a plurality of interface programs;
the interface program comprises a software system and an application program which are deployed locally or at the cloud;
the content access module is also used for analyzing the multi-mode large model to obtain user content and storing the user content into a file system or a database; user content includes text, json strings, vectors, SQL, and program code;
the content access module is also used for calling the file system and the database to execute basic operation, wherein the basic operation comprises newly-built folders or newly-built database tables;
the script/library environment is used to deploy script code from multimodal big model generation.
5. The AI-large language model-based intelligent informatization system of claim 4, wherein: the information feature module is also used for providing feature data for visitors to access;
the characteristic data includes: and the natural language form adopts constraint conditions and vectors constructed by logic symbols.
6. The AI-large language model-based intelligent informatization system of claim 5, wherein: the global routing platform is used for externally disclosing the access address of the intelligent informatization subsystem after the intelligent informatization subsystem is registered, so that the content providing module can access and push the content;
the global routing platform is also used for matching the content provided by the content providing module with the characteristic data of the information required by the intelligent informatization subsystem.
7. The AI-large language model-based intelligent informatization system of claim 6, wherein: the global routing platform is also used for sorting all the contents from the content providing module, including sorting out content classification, content keywords and content abstracts;
and the content matching module is also used for performing content matching according to the characteristic data of the intelligent informatization subsystem, and if the matching is successful, transmitting the content to the registered address of the intelligent informatization subsystem.
8. The AI-large language model-based intelligent informatization system of claim 1, wherein: the address provided by the registration account opening module is an API address, and comprises an API address I, an API address II, an API address III and an API address IV;
the API address is used for informing the visitor of the basic situation of the intelligent informatization subsystem, wherein the basic situation comprises the purpose of the intelligent informatization subsystem and the function provided by the intelligent informatization subsystem;
the API address II is used for informing the information acquisition module and the content providing module of information required by the intelligent informatization subsystem;
the API address is used for receiving the content sent by the content providing module; the data format of the content includes TXT, JSON, HTML;
the API address four is used for an external visitor user or an external AI large model to initiate a dialogue with the intelligent informatization subsystem.
9. A method for constructing an intelligent informatization system based on an AI large language model, which is applied to the system of any one of claims 1 to 8, and is characterized by comprising the following contents:
s1, receiving registration account opening of a user through a registration account opening module, and providing a unique address on the Internet;
s2, receiving dialogue contents input by a user through a dialogue module;
s3, analyzing a user instruction and user content from the dialogue content through a multi-mode large model of the large model module; analyzing information required to be acquired by a user from the dialogue content, and sorting the information into characteristic data;
s4, acquiring a user instruction through the content access module, and delivering the user instruction to the execution module for execution; feature data is also acquired;
s5, the execution module executes corresponding operation according to the user instruction;
s6, storing the characteristic data through an information characteristic module;
s7, the information acquisition module acquires content from an external website according to the characteristic data;
s8, providing information routing service for the intelligent informatization subsystem through the global routing platform;
s9, the content providing module provides specified content for the global routing platform; the global routing platform performs content matching according to the characteristic data of the intelligent informatization subsystem, and if the matching is successful, the global routing platform sends the content to the registered address of the intelligent informatization subsystem;
s10, storing contents from the information acquisition module and the content providing module through an information base; and storing the data from the execution module.
10. The AI-large language model-based intelligent informatization system construction method according to claim 9, wherein: in step S3, the multimodal big model analyzes the user instruction which needs to be stored, queried or executed by the program in the dialogue content, converts the corresponding natural language in the dialogue content into the user instruction of the formatted language, sends the user instruction to the content access module, and informs the content access module of the execution module which needs to be called.
CN202311023967.7A 2023-08-15 2023-08-15 Intelligent informatization system based on AI large language model and construction method Pending CN117032643A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311023967.7A CN117032643A (en) 2023-08-15 2023-08-15 Intelligent informatization system based on AI large language model and construction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311023967.7A CN117032643A (en) 2023-08-15 2023-08-15 Intelligent informatization system based on AI large language model and construction method

Publications (1)

Publication Number Publication Date
CN117032643A true CN117032643A (en) 2023-11-10

Family

ID=88639700

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311023967.7A Pending CN117032643A (en) 2023-08-15 2023-08-15 Intelligent informatization system based on AI large language model and construction method

Country Status (1)

Country Link
CN (1) CN117032643A (en)

Similar Documents

Publication Publication Date Title
US11556697B2 (en) Intelligent text annotation
KR101114023B1 (en) Content propagation for enhanced document retrieval
US7289985B2 (en) Enhanced document retrieval
US20020111934A1 (en) Question associated information storage and retrieval architecture using internet gidgets
US20140279622A1 (en) System and method for semantic processing of personalized social data and generating probability models of personal context to generate recommendations in searching applications
Hyvönen Semantic portals for cultural heritage
US20140149390A1 (en) Automatically Providing Relevant Search Results Based on User Behavior
US20110314382A1 (en) Systems of computerized agents and user-directed semantic networking
US20140114942A1 (en) Dynamic Pruning of a Search Index Based on Search Results
JP2003518664A (en) Method and system for constructing a personalized result set
US20100274770A1 (en) Transductive approach to category-specific record attribute extraction
JP2023507286A (en) Automatic creation of schema annotation files for converting natural language queries to structured query language
US9886480B2 (en) Managing credibility for a question answering system
JPWO2003060764A1 (en) Information retrieval system
Candela An automatic data quality approach to assess semantic data from cultural heritage institutions
Vehviläinen et al. A semi-automatic semantic annotation and authoring tool for a library help desk service
Fox Building and using digital libraries for ETDs
Hagood A brief introduction to data mining projects in the humanities
Lu et al. Language engineering for the Semantic Web: A digital library for endangered languages
Zhang et al. Complementary classification techniques based personalized software requirements retrieval with semantic ontology and user feedback
US8195458B2 (en) Open class noun classification
CN117032643A (en) Intelligent informatization system based on AI large language model and construction method
Niranjan et al. Higher Education Enrolment Query Chatbot Using Machine Learning
Serdyukov Search for expertise: going beyond direct evidence
Kumar et al. Web data mining using xML and agent framework

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination