CN117668002A - Big data decision method, device and equipment applied to public information platform - Google Patents

Info

Publication number
CN117668002A
Authority
CN
China
Prior art keywords
query
public
big data
information
result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410140961.6A
Other languages
Chinese (zh)
Other versions
CN117668002B (en)
Inventor
余芳
余聪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangxi Heyi Cloud Data Technology Co ltd
Original Assignee
Jiangxi Heyi Cloud Data Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangxi Heyi Cloud Data Technology Co ltd
Priority to CN202410140961.6A
Publication of CN117668002A
Application granted
Publication of CN117668002B
Legal status: Active

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a big data decision method, device and equipment applied to a public information platform. The method comprises: generating a data lottery request; starting a preset big data network model based on the data lottery request; performing a text feature convolution process on public query information through the big data network model to obtain query vectors; performing a query process for each query vector on the open-source Internet through the big data network model, converting each query result into a result vector, and importing it into a network structure; performing connection calibration between the result vectors and the query vectors on the network structure based on connection factors; repeatedly searching data on the Internet and judging whether the network structure reaches a preset load threshold; and, if not, using the trained network structure to reply autonomously to the public query information. The method can effectively improve the efficiency and precision with which the AI public information platform processes public query information, strengthen network search capability and optimize data screening, thereby significantly improving the processing capability and user experience of the public information platform.

Description

Big data decision method, device and equipment applied to public information platform
Technical Field
The invention relates to the technical field of public electric digital data processing, and in particular to a big data decision method, device and equipment applied to a public information platform.
Background
With the rapid development of big data and artificial intelligence technology, these technologies are applied more and more widely on public information platforms. Such platforms accumulate a large amount of user-generated public query information, which is processed and used to provide more accurate and personalized services.
However, as the amount of data increases dramatically, AI public information platforms cannot always process all public query information efficiently. Conventionally, when the AI public information platform cannot identify public query information, manual intervention is required, which is inefficient and wastes manpower. The decision methods in the prior art also have drawbacks: for example, the processing of data lottery requests and the convolution of text features often fail to match the public query information exactly.
Disclosure of Invention
The main purpose of the invention is to provide a big data decision method, a big data decision device and big data decision equipment applied to a public information platform, which can effectively improve the efficiency and the precision of processing public query information by an AI public information platform, strengthen the network searching capability and optimize the data screening, thereby obviously improving the processing capability and the user experience of the public information platform.
In order to achieve the above object, the present invention provides a big data decision method applied to a public information platform, comprising the following steps:
S1, generating a data lottery request, wherein the data lottery request carries public query information; the data lottery request is generated when the AI public information platform cannot identify the public query information, and the AI public information platform is used for autonomously replying to public query information raised by users on a preset public information platform;
S2, starting a preset big data network model based on the data lottery request, and inputting the data lottery request into the big data network model; performing a text feature convolution process on the public query information carried in the data lottery request through the big data network model to obtain query vectors matched with the public query information, wherein the number of query vectors is greater than or equal to 1;
S3, performing a query process for each query vector on the open-source Internet through the big data network model, and converting the query results obtained in one-to-one correspondence with the query vectors into result vectors, which are imported into a network structure of the big data network model; the network structure gathers the query vectors and result vectors in a mesh plane structure;
S4, based on the connection factors carried by the connection lines on the network structure, performing connection calibration between the result vectors and the query vectors on the network structure, and screening out repeated and irrelevant impurity result vectors; the connection factors are the correspondence factors produced when a query result is retrieved from the Internet through a query vector;
S5, repeating steps S3-S4 to search data on the Internet so as to optimize the connection factors on the network structure, and judging whether the network structure reaches a preset load threshold;
S6, if not, using the network structure trained by the big data network model to reply autonomously to the public query information carried in the data lottery request.
Further, before the step of generating the data lottery request, the method includes:
importing the big data network model into the AI public information platform.
Further, the step of performing a text feature convolution process on the public query information carried in the data lottery request through the big data network model to obtain a query vector matched with the public query information carried in the data lottery request comprises the following steps:
text-encoding the public query information by word embedding;
Performing feature extraction and convolution processes on the public query information after text encoding through a convolution neural network preset on a big data network model to obtain a query vector;
pooling the query vector;
if the number of the query vectors is greater than 1, splitting the pooled query vectors through softmax or SVM.
Further, the step of performing a query process on each query vector in the open-source internet through the big data network model, and converting the query results obtained in a one-to-one correspondence manner into result vectors to be imported into a network structure of the big data network model includes:
when the public query information is subjected to text coding, word segmentation is carried out on the public query information according to word senses to obtain query information with the number more than or equal to 1, and in the process of carrying out text coding on the query information, the query information is output to the Internet to carry out word segmentation query to obtain query results;
performing text coding on the query result, and performing feature extraction and convolution processes on the query result after text coding in a convolutional neural network to obtain a result vector;
the query vector and the result vector are input into a network structure and are associated by a connection factor of the network structure.
Further, the step of calibrating the connection between the result vector and the query vector on the network structure based on the connection factor carried by the connection line on the network structure comprises the following steps:
searching big data content in the Internet through the query information continuously to obtain a plurality of query results with corresponding connection factors with the query information;
and vectorizing a plurality of query results to obtain a plurality of corresponding result vectors, inputting the plurality of result vectors to a network structure, and associating the plurality of result vectors with one query result through each corresponding connection factor.
Further, the step of screening out duplicate, unrelated impurity result vectors, comprising:
and carrying out data optimization on the network structure by using a convolutional neural network based on the big data network model.
Further, the step of performing data optimization on the network structure by using the convolutional neural network based on the big data network model comprises the following steps:
identifying each node and connecting lines on the network structure, wherein the nodes comprise query vectors and result vectors, and the connecting lines comprise connecting factors;
step sliding is carried out on each node through the convolution kernel, multiply-accumulate operation is carried out on the data of the corresponding node position and the weight in the convolution kernel, and the nodes with the same similarity weight threshold value are deleted after the bias term is added;
and applying ReLU nonlinear activation to each node and each connecting line, and performing downsampling, so as to optimize the data of the nodes and connecting lines on the network structure accordingly.
The invention also discloses a big data decision device applied to the public information platform, which comprises:
the request unit is used for generating a data lottery request, wherein the data lottery request carries public query information; the data lottery request is generated when the AI public information platform cannot identify public query information, and the AI public information platform is used for independently replying the public query information proposed by a user on a preset public information platform;
the model unit is used for starting a preset big data network model based on the data lottery request and inputting the data lottery request into the big data network model; performing a text feature convolution process on the public query information carried in the data lottery request through the big data network model to obtain query vectors matched with the public query information, wherein the number of query vectors is greater than or equal to 1;
the query unit is used for carrying out a query process on each query vector in the open-source internet through the big data network model, converting query results obtained in one-to-one correspondence with each query vector into result vectors, and importing the result vectors into a network structure of the big data network model; the network structure gathers query vectors and result vectors through a mesh plane structure;
The grid unit is used for carrying out connection calibration with the query vector on the network structure based on the connection factors carried by the connection lines on the network structure, and screening out repeated and irrelevant impurity result vectors; the connection factors are corresponding relation factors when the query result is called in the Internet through the query vector;
the optimizing unit is used for searching data on the Internet by cyclically repeating the steps of the query unit and the grid unit, so as to optimize the connection factors on the network structure and judge whether the network structure reaches a preset load threshold;
and the output unit is used for adopting a network structure trained by the big data network model if not, and responding to public query information carried in the data lottery request.
The invention also discloses a computer device, which comprises a memory and a processor, wherein the memory stores a computer program, and the processor realizes the steps of the big data decision method applied to the public information platform when executing the computer program.
The invention also discloses a computer readable storage medium, on which a computer program is stored, which when being executed by a processor implements the steps of the big data decision method applied to the public information platform.
Advantageous effects
Improved problem-processing efficiency: the decision method uses the big data network model and a convolutional neural network to automatically process public query information that the AI public information platform cannot identify, which reduces manual intervention and greatly improves problem-processing efficiency.
Improved problem-processing accuracy: text features are extracted and convolved by the convolutional neural network, so that public query information can be matched accurately, improving the accuracy of problem processing.
Enhanced network search capability: the query vectors are used to perform a query process on the open-source Internet, searching data on the Internet to the greatest extent and obtaining results related to the query vectors, so that more Internet resources are used to answer the question.
Data screening and optimization: data optimization is performed on the network structure through the convolutional neural network, including identifying and optimizing each node and connecting line on the network structure and screening out repeated and irrelevant impurity result vectors, so that public query information can be answered more accurately.
Improved capability of the AI public information platform: after training with the big-data-driven decision method, the AI public information platform can autonomously reply to more types of public query information, which improves the processing capability of the public information platform and the user experience.
Drawings
FIG. 1 is a flow chart of a big data decision method applied to a public information platform according to an embodiment of the invention;
FIG. 2 is a block diagram of a big data decision device applied to a public information platform according to an embodiment of the present invention;
FIG. 3 is a block diagram schematically illustrating a structure of a computer device according to an embodiment of the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Referring to FIG. 1, the big data decision method applied to a public information platform according to the present invention comprises the following steps:
S1, generating a data lottery request, wherein the data lottery request carries public query information; the data lottery request is generated when the AI public information platform cannot identify the public query information, and the AI public information platform is used for autonomously replying to public query information raised by users on a preset public information platform;
S2, starting a preset big data network model based on the data lottery request, and inputting the data lottery request into the big data network model; performing a text feature convolution process on the public query information carried in the data lottery request through the big data network model to obtain query vectors matched with the public query information, wherein the number of query vectors is greater than or equal to 1;
S3, performing a query process for each query vector on the open-source Internet through the big data network model, and converting the query results obtained in one-to-one correspondence with the query vectors into result vectors, which are imported into a network structure of the big data network model; the network structure gathers the query vectors and result vectors in a mesh plane structure;
S4, based on the connection factors carried by the connection lines on the network structure, performing connection calibration between the result vectors and the query vectors on the network structure, and screening out repeated and irrelevant impurity result vectors; the connection factors are the correspondence factors produced when a query result is retrieved from the Internet through a query vector;
S5, repeating steps S3-S4 to search data on the Internet so as to optimize the connection factors on the network structure, and judging whether the network structure reaches a preset load threshold;
S6, if not, using the network structure trained by the big data network model to reply autonomously to the public query information carried in the data lottery request.
Specifically:
In S1, the AI public information platform is used to reply autonomously to the various public query information raised by users on the public information platform. For example, the platform may be an online question-and-answer platform on which users raise questions and the platform attempts to answer them automatically. In some cases, however, the AI public information platform cannot recognize or understand a user's question, for instance because the subject matter lies beyond the platform's knowledge or because the question is phrased in a way the platform is not familiar with. In these cases, the AI public information platform generates a request called a "data lottery request" and outputs it to the big data network model. The purpose of the data lottery request is to seek assistance in understanding and answering the user's question. The request may contain various information, such as the text of the question and other interaction information of the user, and may even contain data or questions similar to the question. Based on the received data lottery request, the big data network model searches its knowledge base for information that helps answer the question and returns this information to the AI public information platform, enabling it to answer questions it previously could not.
The big data network model is an intelligent decision classification model (specifically a natural language model) preset at a terminal (computer equipment) corresponding to the AI public information platform, and the starting of the big data network model has triggering conditions as follows:
when the AI public information platform receives public query information and performs text recognition on it, no corresponding reply text preset on the AI public information platform can be found through the keywords in the public query information;
when this condition is triggered, a data lottery request corresponding to the public query information is formed and input to the big data network model.
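The trigger condition above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `PRESET_REPLIES`, `DataLotteryRequest`, `find_preset_reply` and `handle_query`, the sample replies, and the whitespace-based keyword extraction are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Union

# Hypothetical preset reply table of the AI public information platform,
# keyed by keyword (an assumption made for this illustration).
PRESET_REPLIES = {
    "passport": "Passport applications are handled at the exit-entry office.",
    "pension": "Pension queries can be made at the social security bureau.",
}


@dataclass
class DataLotteryRequest:
    """Request sent to the big data network model; carries the unanswered query."""
    public_query: str


def find_preset_reply(public_query: str) -> Optional[str]:
    """Return the preset reply for the first matching keyword, if any."""
    for word in public_query.lower().split():
        reply = PRESET_REPLIES.get(word.strip("?.,!"))
        if reply is not None:
            return reply
    return None


def handle_query(public_query: str) -> Union[str, DataLotteryRequest]:
    """Reply autonomously when possible; otherwise form a data lottery request."""
    reply = find_preset_reply(public_query)
    if reply is not None:
        return reply
    return DataLotteryRequest(public_query=public_query)
```

A query containing a known keyword returns the preset text, while any other query yields a `DataLotteryRequest` that would be handed to the big data network model.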
In one embodiment, the natural language model includes NLP or GPT.
In S2, the big data network model receives the data lottery request output by the AI public information platform together with the public query information it carries. The public query information may be text data in various forms, such as words, sentences or paragraphs. The big data network model then performs a text feature convolution process on the public query information. Convolution is a common processing method in natural language processing and related fields and can extract key features from data; here, its main role is to extract the key semantic features of the public query information. Next, based on the extracted text features, the big data network model generates one or more query vectors. These query vectors are numerical data containing the semantic features of the public query information and can be used to search the knowledge base of the big data network model (the Internet platform) for information that helps answer the query. The number of query vectors is at least 1 and may be greater, depending on the complexity of the public query information. For example, if a question contains multiple parts, multiple query vectors may be generated, each corresponding to one part of the question.
In S3, after the public query information has been feature-convolved by the big data network model and one or more query vectors have been obtained, the big data network model uses these query vectors to query the open-source Internet. In this process, each query vector carries the semantic properties of the public query information, and data similar to it is found on the Internet. The query results are then converted into result vectors. A result vector is numerical data containing the key information and features of a query result. The purpose of this conversion is to obtain a standardized and consistent data format so that the data can be processed and analyzed efficiently. The result vectors are then imported into the network structure of the big data network model. The network structure is typically a mesh plane structure that can organize and manage large amounts of data efficiently; in it, both the source of a query (the query vector) and the results of the query (the result vectors) can be seen clearly.
A query vector is obtained by the big data network model performing semantic convolution on the public query information. The big data network model submits each query vector to the Internet to search for matching results. Because the query vector carries the semantics (query keywords) of the public query information, corresponding query results can be found on the open-source Internet. Finally, the big data network model convolves each query result to obtain the corresponding result vector, which is recorded in the big data network model.
In addition, because the AI public information platform needs to be trained and optimized, a network structure is provided on the big data network model. The network structure is composed of a plurality of nodes and a plurality of connecting lines (for example, a knowledge graph), and the nodes are classified into first nodes and second nodes: a first node records a query vector, and a second node records a result vector. After multiple rounds of training, a plurality of nodes are stored in the network structure, and the big data network model writes (burns) them into the AI public information platform to improve its learning capability.
In S4, the network structure contains the query vectors and their corresponding result vectors. In the network structure, a query vector and a result vector are connected by a connection line, which carries an important piece of information, the "connection factor". The connection factor represents the relationship between the query vector and the result vector and is calculated automatically by the system during the query. It reflects the degree of matching or relevance between the query vector and the query result retrieved from the Internet through that query vector. The system then calibrates the result vectors on the network structure based on these connection factors. This calibration process connects each result vector with its query vector as nodes in the network structure, so that it is clear which result vectors are related to the original query vector and what the relationship between them is.
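As an illustration of the structure described above, the following sketch stores query vectors as first nodes, result vectors as second nodes, and a connection factor on each connecting line. The patent does not give a formula for the connection factor; cosine similarity is used here purely as an assumption, and the class and method names are hypothetical.

```python
import math
from dataclasses import dataclass, field


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class NetworkStructure:
    query_nodes: dict = field(default_factory=dict)   # first nodes: id -> query vector
    result_nodes: dict = field(default_factory=dict)  # second nodes: id -> result vector
    lines: dict = field(default_factory=dict)         # (query id, result id) -> connection factor

    def add_query(self, query_id, vector):
        self.query_nodes[query_id] = vector

    def add_result(self, query_id, result_id, vector):
        """Import a result vector and calibrate its connection to the query vector."""
        self.result_nodes[result_id] = vector
        # Assumption: the connection factor is the cosine similarity of the two vectors.
        self.lines[(query_id, result_id)] = cosine(self.query_nodes[query_id], vector)

    def results_for(self, query_id):
        """Result ids linked to a query, strongest connection factor first."""
        linked = [(rid, f) for (qid, rid), f in self.lines.items() if qid == query_id]
        return sorted(linked, key=lambda pair: -pair[1])
```

Keeping the factor on the line rather than on either node lets one query vector link to several result vectors, each with its own strength.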
When the big data network model trains and optimizes the AI public information platform, the second node (carrying the result vector) must be retrieved according to its correspondence with the first node (carrying the query vector), so that the query result corresponding to the result vector in the second node can be burned into the AI public information platform. The network structure therefore has to be ordered during burning, so that the first node and the second node are associated through the connecting line and the burning proceeds in association from question to answer.
Because a result vector is obtained by an Internet search using the semantics of the public query information carried by the query vector, once the result vector is input to the big data network model and forms a second node, the semantics of the public query information it carries are converted into the connection line that associates the first node with its related second node.
Meanwhile, if the semantics of the public query information comprise several questions, one first node may be associated with several related second nodes (as in a decision tree model).
In one embodiment, when the several questions contained in the semantics of the public query information are strictly related, and the acquired second nodes (result vectors) are therefore also strictly related, the query results corresponding to those result vectors are integrated. This reduces the generation of impurity second nodes (impurity result vectors) and optimizes the memory occupied by the network structure in the big data network model.
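The screening of repeated and irrelevant impurity result vectors can be illustrated with a simple sketch. The patent performs this screening with a convolutional neural network; the version below swaps in plain cosine-similarity thresholds so the idea is easy to follow. The function names and both threshold values are assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def screen_results(query_vec, result_vecs, min_relevance=0.3, dup_threshold=0.95):
    """Keep result vectors that are relevant to the query and not near-duplicates.

    min_relevance and dup_threshold are hypothetical tuning values.
    """
    kept = []
    for vec in result_vecs:
        if cosine(query_vec, vec) < min_relevance:
            continue  # irrelevant impurity result vector
        if any(cosine(vec, other) >= dup_threshold for other in kept):
            continue  # repeated impurity result vector
        kept.append(vec)
    return kept
```

Fewer surviving result vectors means fewer second nodes, which is how screening reduces the memory occupied by the network structure.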
In S5, the network structure is optimized by continuously repeating S3-S4, yielding a network structure model that can reply to the public query information completely; the network structure can also split the public query information into chains to derive further question-answering decision schemes.
In S6, after the iterative optimization process described above, the load of the network structure does not exceed the preset threshold, that is, the network structure has been sufficiently trained and optimized, and then the big data network model outputs the trained network structure. The output is aimed at an AI public information platform. In other words, this trained and optimized network structure is connected back to the AI public information platform. The AI public information platform uses the network structure trained and optimized by the big data network model to further process, such as answering public questions.
In one embodiment, prior to the step of generating the data lottery request, the method comprises:
importing the big data network model into the AI public information platform, wherein the big data network model is used for executing the following method:
acquiring public query information proposed by a user, and performing a text feature convolution process on the public query information through a big data network model to obtain a query vector matched with the public query information;
performing a query process on the open-source Internet through the query vector, and converting the query results obtained in one-to-one correspondence into result vectors, which are imported into a network structure of the big data network model; the network structure gathers the query vectors and result vectors in a mesh plane structure;
based on the connection factors carried by the connection lines on the network structure, performing connection calibration between the result vectors and the query vectors on the network structure, and screening out repeated and irrelevant impurity result vectors; the connection factors are the correspondence factors produced when a query result is retrieved from the Internet through a query vector;
searching data on the Internet to the greatest extent so as to optimize the connection factors on the network structure, and judging whether the network structure reaches a preset load threshold;
If yes, adopting a network structure to autonomously reply public query information.
In one embodiment, the step of performing a text feature convolution process on public query information carried in the data lottery request through the big data network model to obtain a query vector matched with the public query information carried in the data lottery request includes:
text-encoding the public query information by word embedding;
performing feature extraction and convolution processes on the public query information after text encoding through a convolution neural network preset on a big data network model to obtain a query vector;
pooling the query vector;
if the number of the query vectors is greater than 1, splitting the pooled query vectors through softmax or SVM.
In a specific implementation, the public query information is first text-encoded by word embedding. Word embedding is a technique that maps words to dense vectors, typically trained using deep learning; it can capture subtle semantic attributes such as gender, grammatical number (singular or plural) and verb tense. A convolutional neural network (CNN) preset on the big data network model then performs feature extraction and convolution on the text-encoded public query information. This process captures the key features of the text and yields a query vector. Next, the query vector is pooled. Pooling is a dimension-reducing operation: a pooling window slides across the query vector and takes the maximum or average value in each window, reducing the vector dimension while retaining the critical information. Finally, if the number of query vectors is greater than 1, the pooled query vectors are split using softmax or a support vector machine (SVM). Softmax is a generalization of the logistic function used for multi-class classification, and SVM is a commonly used classification algorithm. The split yields several individual query vectors, each containing part of the features of the query.
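The encode, convolve, pool and softmax sequence described above can be sketched in plain Python. The embedding table, kernel weights and function names are toy assumptions (a real system would learn the embeddings and kernels), and the patent does not specify how softmax splits the vectors; here softmax only normalizes the pooled features into weights.

```python
import math

# Toy embedding table; real embeddings are learned (assumption for illustration).
EMBEDDINGS = {
    "where": [0.1, 0.3],
    "is": [0.0, 0.1],
    "library": [0.9, 0.2],
}
UNKNOWN = [0.0, 0.0]  # vector used for out-of-vocabulary tokens


def embed(tokens):
    """Text encoding: map each token to its dense vector."""
    return [EMBEDDINGS.get(token, UNKNOWN) for token in tokens]


def conv1d(embedded, kernel, bias=0.0):
    """Slide a kernel (window x dim weights) over the tokens; apply ReLU."""
    window = len(kernel)
    features = []
    for start in range(len(embedded) - window + 1):
        acc = bias
        for k in range(window):
            acc += sum(w * x for w, x in zip(kernel[k], embedded[start + k]))
        features.append(max(0.0, acc))  # ReLU activation
    return features


def max_pool(features, size=2):
    """Max pooling over non-overlapping windows of the given size."""
    return [max(features[i:i + size]) for i in range(0, len(features), size)]


def softmax(scores):
    """Numerically stable softmax: outputs lie in (0, 1) and sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


tokens = ["where", "is", "the", "library"]
features = conv1d(embed(tokens), kernel=[[1.0, 0.0], [1.0, 0.0]])
pooled = max_pool(features)
weights = softmax(pooled)
```

With four tokens and a window of two, the convolution produces three features, and pooling with a window of two reduces them to two values before softmax normalizes them.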
Text feature convolution of the public query information: first, the public query information is text-encoded using word embedding, which maps words or phrases from a vocabulary into a vector space. The preset convolutional neural network then performs feature extraction and convolution on the encoded text to obtain a query vector. The purpose of this step is to convert complex text information into a vector form that a machine can process.
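By way of illustration only, the word embedding and convolution step described above may be sketched in Python with NumPy as follows. The vocabulary, the randomly initialized embedding table, the kernel width, and the function names `embed` and `conv1d_text` are assumptions made for this sketch; the embodiment does not specify them, and a trained model would learn the table and kernel weights.

```python
import numpy as np

def embed(tokens, vocab, table):
    """Look up one dense vector per token (unknown tokens map to row 0)."""
    ids = [vocab.get(t, 0) for t in tokens]
    return table[ids]                      # shape: (num_tokens, embed_dim)

def conv1d_text(embedded, kernel, bias=0.0):
    """Slide a (width, embed_dim) kernel over the token axis with stride 1."""
    width = kernel.shape[0]
    steps = embedded.shape[0] - width + 1
    return np.array([
        np.sum(embedded[i:i + width] * kernel) + bias
        for i in range(steps)
    ])

rng = np.random.default_rng(0)
vocab = {"<unk>": 0, "how": 1, "renew": 2, "passport": 3}   # toy vocabulary
table = rng.normal(size=(len(vocab), 8))   # toy embedding table, 8 dimensions
kernel = rng.normal(size=(2, 8))           # one filter spanning 2 tokens

tokens = ["how", "renew", "passport"]
features = conv1d_text(embed(tokens, vocab, table), kernel)
print(features.shape)   # (2,)
```

Three tokens and a kernel spanning two tokens give two window positions, so the feature vector has two entries; a practical network would apply many such filters in parallel.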
Processing and optimizing the query vector: the query vector is then pooled. Pooling reduces the amount of data output by the convolutional layer by taking the maximum value or the average value over regions of the feature matrix. If the number of query vectors is greater than 1, the pooled query vectors are split by softmax or an SVM. The softmax function maps a K-dimensional vector z of arbitrary real numbers to a K-dimensional real vector σ(z) in which each element lies in the interval (0, 1) and all elements sum to 1. A support vector machine (SVM) is a commonly used model for classification or regression.
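By way of illustration only, max pooling and the softmax function σ(z)_i = exp(z_i) / Σ_j exp(z_j) may be sketched as follows. The window size, the sample feature values, and the function names are assumptions made for this sketch; subtracting the maximum before exponentiation is a common numerical-stability measure and is not taken from the embodiment.

```python
import numpy as np

def max_pool1d(x, window, stride=None):
    """Take the maximum of each window; stride defaults to the window size."""
    stride = stride or window
    return np.array([x[i:i + window].max()
                     for i in range(0, len(x) - window + 1, stride)])

def softmax(z):
    """Map real scores to values in (0, 1) that sum to 1."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())        # shift by the maximum to avoid overflow
    return e / e.sum()

features = np.array([0.2, 1.5, -0.3, 0.9, 2.1, 0.0])   # made-up feature values
pooled = max_pool1d(features, window=2)   # [1.5, 0.9, 2.1]
probs = softmax(pooled)
print(pooled, probs.sum())
```

Pooling halves the length of the feature vector here, and the largest pooled value receives the largest softmax weight.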
Query over the Internet: the query vector is used to search the open Internet, and the query results obtained are converted into result vectors. In this process the query vector acts as a semantic carrier for finding matching content on the Internet. The result vectors obtained in this step retain the features of the original query and are enriched with features of the retrieved content.
Processing of the result vectors: the result vectors are imported into the network structure of the big data network model and connected to the query vectors according to the connecting lines of the network structure and the connection factors those lines carry. Repeated and irrelevant impurity result vectors are then screened out.
Iterative optimization: the above steps are repeated to search the data on the Internet as extensively as possible and to optimize the connection factors of the network structure. At the same time, it is judged whether the network structure has reached a preset load threshold, which safeguards the stability and efficiency of the system and improves the accuracy of the information.
Output: once model training is complete and the network structure does not exceed the preset load threshold, the trained network structure is output to the AI public information platform so that the platform can autonomously reply to the public query information.
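By way of illustration only, the repeated search with a load-threshold check may be sketched as follows. The embodiment does not define how load is measured; this sketch assumes the load is the number of query-result links held in the network structure. The callable `search_round`, the stand-in `fake_search`, and the `max_rounds` limit are hypothetical.

```python
def run_search_rounds(query_ids, search_round, load_threshold, max_rounds=10):
    """Repeat the search step until the load threshold is hit or nothing new arrives.

    `search_round(query_id, round_no)` is a hypothetical callable returning
    an iterable of (result_id, connection_factor) pairs.
    Returns (edges, threshold_reached).
    """
    edges = {}  # (query_id, result_id) -> connection factor
    for round_no in range(max_rounds):
        before = len(edges)
        for q in query_ids:
            for result_id, factor in search_round(q, round_no):
                key = (q, result_id)
                # keep the strongest factor seen so far for this link
                edges[key] = max(factor, edges.get(key, factor))
                if len(edges) >= load_threshold:
                    return edges, True
        if len(edges) == before:
            break                      # no new links in this round: stop early
    return edges, False

def fake_search(query_id, round_no):
    # Stand-in for a real search: one new result per query in each of 2 rounds.
    if round_no < 2:
        return [(f"{query_id}-r{round_no}", 0.5 + 0.1 * round_no)]
    return []

edges, reached = run_search_rounds(["q1", "q2"], fake_search, load_threshold=100)
print(len(edges), reached)   # 4 False
```

In this sketch a return value of `False` for the second element corresponds to the "if not" branch, in which the network structure is passed on for replying.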
In one embodiment, the step of performing a query process on each query vector in the open-source internet through the big data network model, and converting the query result obtained in a one-to-one correspondence into a result vector for importing into a network structure of the big data network model includes:
when the public query information is text-encoded, segmenting the public query information into words according to word sense to obtain one or more pieces of query information, and, in the process of text-encoding the query information, outputting the query information to the Internet for a segmented-word query to obtain query results;
performing text encoding on the query results, and performing feature extraction and convolution on the text-encoded query results in the convolutional neural network to obtain result vectors;
inputting the query vector and the result vectors into the network structure and associating them through the connection factors of the network structure.
In the specific implementation process, after the public query information is obtained, word segmentation is performed first: the question is segmented according to word sense to obtain a series of pieces of query information. For example, if a sentence contains multiple words, each word may serve as independent query information. The segmented query information is text-encoded by word embedding, and the encoded query information is then output to the Internet for querying. This query process amounts to finding data or information in the open-source Internet that matches the query information. The results returned by the query are likewise text-encoded, and a convolutional neural network performs feature extraction and convolution on them, so that the query results are converted into result vectors that can be processed. Next, the query vector and the result vectors are input into the network structure, and each query vector is associated with its corresponding result vectors by operating on the connection factors of the network structure.
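By way of illustration only, the segmentation and per-term query described above may be sketched as follows. Whitespace splitting with a stop-word list is a crude stand-in for segmentation by word sense, and `toy_search` with its small in-memory corpus is a hypothetical replacement for an Internet query; neither is specified by the embodiment.

```python
STOPWORDS = {"how", "do", "i", "a", "the", "to", "my"}   # assumed stop-word list

def segment(question):
    """Split on whitespace, strip punctuation, and drop stop words."""
    words = [w.strip("?.,!").lower() for w in question.split()]
    return [w for w in words if w and w not in STOPWORDS]

def query_each(terms, search):
    """Call the hypothetical `search(term)` once per term, grouped by term."""
    return {term: list(search(term)) for term in terms}

def toy_search(term):
    # Stand-in for an Internet query against a tiny in-memory corpus.
    corpus = {
        "passport": ["Passport renewal guide"],
        "renew": ["Renewal fees"],
    }
    return corpus.get(term, [])

terms = segment("How do I renew my passport?")
results = query_each(terms, toy_search)
print(terms)   # ['renew', 'passport']
```

Keeping the results grouped by term preserves the one-to-one correspondence between each piece of query information and its query results.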
In one embodiment, the step of calibrating the connection between the result vector and the query vector on the network structure based on the connection factor carried by the connection line on the network structure includes:
searching big data content in the Internet through the query information continuously to obtain a plurality of query results with corresponding connection factors with the query information;
and vectorizing a plurality of query results to obtain a plurality of corresponding result vectors, inputting the plurality of result vectors to a network structure, and associating the plurality of result vectors with one query result through each corresponding connection factor.
In the specific implementation process, a continuous big data search is carried out on the Internet using the query information. This step may be implemented, for example, by continuously retrieving data or information relevant to the query information through crawler technology or API calls, which yields a set of query results. The query results are obtained by matching against the connection factors already embedded in the query information. The set of query results is then vectorized, that is, the query results are converted into result vectors using an encoding technique such as word embedding. Once the result vectors are obtained, they are input into the network structure and associated with the query vector through the connection factors. The association here may mean adjusting the connection factors in the network structure with a learning algorithm, such as back-propagation or reinforcement learning, to optimize the association between the query vector and the result vectors.
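By way of illustration only, associating result vectors with a query vector may be sketched as follows. The embodiment does not give a formula for the connection factor; this sketch assumes cosine similarity as a simple stand-in, and the vectors and the names `cosine` and `link_results` are hypothetical.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def link_results(query_vec, result_vecs):
    """Return {result_name: connection_factor} for one query vector."""
    return {name: cosine(query_vec, vec) for name, vec in result_vecs.items()}

query = np.array([1.0, 0.0, 1.0])
results = {
    "r1": np.array([2.0, 0.0, 2.0]),   # same direction as the query
    "r2": np.array([0.0, 1.0, 0.0]),   # orthogonal to the query
}
factors = link_results(query, results)
print(factors)
```

Under this assumption a factor near 1 marks a result closely aligned with the query and a factor near 0 marks an unrelated one; a learned model could adjust these values during training instead.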
In one embodiment, the step of screening out duplicate, unrelated impurity result vectors comprises:
and carrying out data optimization on the network structure by using a convolutional neural network based on the big data network model.
Specifically, the step of performing data optimization on the network structure by using the convolutional neural network based on the big data network model comprises the following steps:
identifying each node and connecting lines on the network structure, wherein the nodes comprise query vectors and result vectors, and the connecting lines comprise connecting factors;
step sliding is carried out on each node through the convolution kernel, multiply-accumulate operation is carried out on the data of the corresponding node position and the weight in the convolution kernel, and the nodes with the same similarity weight threshold value are deleted after the bias term is added;
and applying ReLU nonlinear activation to each node and each connecting line, and performing downsampling, so as to optimize the data of the nodes and connecting lines on the network structure.
Identifying each node and connecting line on the network structure: the nodes represent the query vectors and the result vectors. These two types of vectors make up the network structure and are related to each other by the connecting lines, which carry the connection factors. The first step is therefore to identify these basic elements of the network.
Sliding with a convolution kernel: next, the convolutional neural network operates on these nodes. A convolution kernel slides over the nodes with a given stride; at each position the data of the corresponding nodes is multiplied by the weights in the convolution kernel and the products are accumulated, which is the basic step of a convolution operation. After a bias term is added, nodes with the same similarity weight threshold value can be deleted. The main purpose of this step is to eliminate repeated or very similar query results and so avoid redundant copies of the same information.
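By way of illustration only, the removal of repeated or very similar result vectors may be sketched as follows. This is one possible reading of the deletion criterion, not the embodiment's exact rule: a vector is dropped when its cosine similarity to an already kept vector reaches an assumed threshold of 0.95. All vectors are assumed to be non-zero.

```python
import numpy as np

def drop_near_duplicates(vectors, threshold=0.95):
    """Return indices of vectors kept after removing near-duplicates.

    A vector is kept only if its cosine similarity to every previously
    kept vector is below `threshold`.
    """
    kept = []  # list of (index, unit_vector)
    for idx, vec in enumerate(vectors):
        unit = vec / np.linalg.norm(vec)
        if all(float(unit @ k_unit) < threshold for _, k_unit in kept):
            kept.append((idx, unit))
    return [idx for idx, _ in kept]

vectors = [
    np.array([1.0, 0.0]),
    np.array([0.99, 0.01]),   # nearly identical to the first vector
    np.array([0.0, 1.0]),
]
print(drop_near_duplicates(vectors))   # [0, 2]
```

The second vector points in almost the same direction as the first and is removed, while the orthogonal third vector is retained.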
ReLU nonlinear activation and downsampling: this step activates all nodes and connecting lines using the rectified linear unit (ReLU) as the activation function. ReLU is a commonly used nonlinear activation function that sets all negative values to zero and passes positive values through unchanged, which can mitigate some problems in neural network training, such as vanishing gradients. Downsampling, that is, reducing the input dimension, is then performed, which helps improve the computational efficiency of the model and can reduce over-fitting.
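By way of illustration only, ReLU activation followed by downsampling may be sketched as follows. Average pooling over groups of two values is an assumed form of downsampling, and the sample scores are made up for this sketch.

```python
import numpy as np

def relu(x):
    """Set negative values to zero and pass positive values through."""
    return np.maximum(x, 0.0)

def downsample_mean(x, factor):
    """Average consecutive groups of `factor` values; a trailing partial group is dropped."""
    usable = len(x) - len(x) % factor
    return x[:usable].reshape(-1, factor).mean(axis=1)

scores = np.array([-1.0, 2.0, 0.5, -0.2, 3.0, 1.0, 4.0])
activated = relu(scores)                   # [0, 2, 0.5, 0, 3, 1, 4]
reduced = downsample_mean(activated, 2)    # [1.0, 0.25, 2.0]
print(activated, reduced)
```

Seven activated values are reduced to three, which shows how downsampling shrinks the data handled by later steps.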
Referring to fig. 2, a block diagram of a big data driving-based decision device according to the present invention is provided, where the device includes:
The request unit 1 is used for generating a data lottery request, wherein the data lottery request carries public query information; the data lottery request is generated when the AI public information platform cannot identify public query information, and the AI public information platform is used for independently replying the public query information proposed by a user on a preset public information platform;
a model unit 2, configured to start a preset big data network model based on the data lottery request, and input the data lottery request to the big data network model; carrying out a text feature convolution process on public query information carried in the data lottery request through the big data network model to obtain query vectors matched with the public query information carried in the data lottery request, wherein the number of the query vectors is more than or equal to 1;
the query unit 3 is used for carrying out a query process on each query vector in the open-source internet through the big data network model, and converting query results obtained in one-to-one correspondence with each query vector into result vectors which are imported into a network structure of the big data network model; the network structure gathers query vectors and result vectors through a mesh plane structure;
The grid unit 4 is used for carrying out connection calibration with the query vector on the network structure based on the connection factors carried by the connection lines on the network structure, and screening out repeated and irrelevant impurity result vectors; the connection factors are corresponding relation factors when the query result is called in the Internet through the query vector;
the optimizing unit 5 is used for cyclically repeating the steps of the query unit and the grid unit, searching the data on the Internet to the maximum extent, optimizing the connection factors on the network structure, and judging whether the network structure reaches a preset load threshold;
and the output unit 6 is used for adopting a network structure trained by the big data network model if not, and responding to public query information carried in the data lottery request.
In this embodiment, for specific implementation of each unit in the above embodiment of the apparatus, please refer to the description in the above embodiment of the method, and no further description is given here.
Referring to fig. 3, in an embodiment of the present invention, there is further provided a computer device, which may be a server, and an internal structure thereof may be as shown in fig. 3. The computer device includes a processor, a memory, a display screen, an input device, a network interface, and a database connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage medium. The database of the computer device is used to store the corresponding data in this embodiment. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements the above-mentioned method.
It will be appreciated by those skilled in the art that the architecture shown in fig. 3 is merely a block diagram of a portion of the architecture in connection with the present inventive arrangements and is not intended to limit the computer devices to which the present inventive arrangements are applicable.
An embodiment of the present invention also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the above method. It is understood that the computer readable storage medium in this embodiment may be a volatile readable storage medium or a nonvolatile readable storage medium.
In summary, a data lottery request is generated, wherein the data lottery request carries public query information; the data lottery request is generated when the AI public information platform cannot identify public query information, and the AI public information platform is used for autonomously replying to the public query information proposed by a user on a preset public information platform; a preset big data network model is started based on the data lottery request, and the data lottery request is input into the big data network model; a text feature convolution process is carried out on the public query information carried in the data lottery request through the big data network model to obtain query vectors matched with the public query information carried in the data lottery request, wherein the number of the query vectors is more than or equal to 1; a query process is carried out on each query vector in the open-source Internet through the big data network model, and the query results obtained in one-to-one correspondence with each query vector are converted into result vectors which are imported into a network structure of the big data network model; the network structure gathers query vectors and result vectors through a mesh plane structure; based on connection factors carried by connection lines on the network structure, connection calibration is carried out between the result vectors and the query vectors on the network structure, and repeated and irrelevant impurity result vectors are screened out; the connection factors are corresponding relation factors when the query result is called in the Internet through the query vector; the data on the Internet is searched to optimize the connection factors on the network structure, and it is judged whether the network structure reaches a preset load threshold; if not, the network structure trained by the big data network model is adopted to respond to the public query information carried in the data lottery request. In this way, the efficiency and the precision with which the AI public information platform processes public query information can be effectively improved, the network searching capability is enhanced, and the data screening is optimized, so that the processing capability and the user experience of the public information platform are significantly improved.
Those skilled in the art will appreciate that all or part of the above described methods may be implemented by a computer program stored on a non-transitory computer readable storage medium, and that the program, when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium provided by the present invention and used in embodiments may include non-volatile and/or volatile memory. The non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or method that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, apparatus, article, or method. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, apparatus, article or method that comprises the element.
The foregoing description is only of the preferred embodiments of the present invention and is not intended to limit the scope of the invention, and all equivalent structures or equivalent processes using the descriptions and drawings of the present invention or direct or indirect application in other related technical fields are included in the scope of the present invention.

Claims (10)

1. The big data decision method applied to the public information platform is characterized by comprising the following steps:
S1, generating a data lottery request, wherein the data lottery request carries public query information; the data lottery request is generated when the AI public information platform cannot identify public query information, and the AI public information platform is used for autonomously replying to the public query information proposed by a user on a preset public information platform;
S2, starting a preset big data network model based on the data lottery request, and inputting the data lottery request into the big data network model; carrying out a text feature convolution process on public query information carried in the data lottery request through the big data network model to obtain query vectors matched with the public query information carried in the data lottery request, wherein the number of the query vectors is more than or equal to 1;
S3, carrying out a query process on each query vector in the open-source Internet through the big data network model, and converting query results obtained in one-to-one correspondence with each query vector into result vectors to be imported into a network structure of the big data network model; the network structure gathers query vectors and result vectors through a mesh plane structure;
S4, based on connection factors carried by connection lines on the network structure, carrying out connection calibration on the result vector and the query vector on the network structure, and screening out repeated and irrelevant impurity result vectors; the connection factors are corresponding relation factors when the query result is called in the Internet through the query vector;
S5, repeating steps S3 to S4 to search the data on the Internet so as to optimize the connection factors on the network structure and judge whether the network structure reaches a preset load threshold;
S6, if not, adopting the network structure trained by the big data network model to autonomously reply to the public query information carried in the data lottery request.
2. The big data decision method applied to a public information platform according to claim 1, wherein before the step of generating the data lottery request, the method further comprises:
and importing the big data network model into an AI public information platform.
3. The big data decision method applied to a public information platform according to claim 1, wherein the step of performing a text feature convolution process on public query information carried in the data lottery request through the big data network model to obtain a query vector matched with the public query information carried in the data lottery request comprises the following steps:
performing text encoding on the public query information by word embedding;
performing feature extraction and convolution processes on the public query information after text encoding through a convolution neural network preset on a big data network model to obtain a query vector;
pooling the query vector;
if the number of the query vectors is greater than 1, splitting the pooled query vectors through softmax or SVM.
4. The big data decision method applied to the public information platform according to claim 1, wherein the step of performing a query process on each query vector in the open-source Internet through the big data network model and converting the query results obtained in one-to-one correspondence into result vectors to be imported into a network structure of the big data network model comprises the following steps:
when the public query information is subjected to text coding, word segmentation is carried out on the public query information according to word senses to obtain query information with the number more than or equal to 1, and in the process of carrying out text coding on the query information, the query information is output to the Internet to carry out word segmentation query to obtain query results;
performing text coding on the query result, and performing feature extraction and convolution processes on the query result after text coding in a convolutional neural network to obtain a result vector;
the query vector and the result vector are input into a network structure and are associated by a connection factor of the network structure.
5. The big data decision method applied to a public information platform according to claim 1, wherein the step of calibrating the connection between the result vector and the query vector on the network structure based on the connection factors carried by the connection lines on the network structure comprises the steps of:
Searching big data content in the Internet through the query information continuously to obtain a plurality of query results with corresponding connection factors with the query information;
and vectorizing a plurality of query results to obtain a plurality of corresponding result vectors, inputting the plurality of result vectors to a network structure, and associating the plurality of result vectors with one query result through each corresponding connection factor.
6. The big data decision method for public information platforms of claim 1, wherein the step of screening out duplicate, unrelated impurity result vectors comprises:
and carrying out data optimization on the network structure by using a convolutional neural network based on the big data network model.
7. The big data decision method applied to a public information platform according to claim 6, wherein the step of performing data optimization on the network structure based on the convolutional neural network of the big data network model comprises the steps of:
identifying each node and connecting lines on the network structure, wherein the nodes comprise query vectors and result vectors, and the connecting lines comprise connecting factors;
step sliding is carried out on each node through the convolution kernel, multiply-accumulate operation is carried out on the data of the corresponding node position and the weight in the convolution kernel, and the nodes with the same similarity weight threshold value are deleted after the bias term is added;
and applying ReLU nonlinear activation to each node and each connecting line, and performing downsampling, so as to optimize the data of the nodes and connecting lines on the network structure.
8. Big data decision device applied to public information platform, characterized by comprising:
the request unit is used for generating a data lottery request, wherein the data lottery request carries public query information; the data lottery request is generated when the AI public information platform cannot identify public query information, and the AI public information platform is used for independently replying the public query information proposed by a user on a preset public information platform;
the model unit is used for starting a preset big data network model based on the data lottery request and inputting the data lottery request into the big data network model; carrying out a text feature convolution process on public query information carried in the data lottery request through the big data network model to obtain query vectors matched with the public query information carried in the data lottery request, wherein the number of the query vectors is more than or equal to 1;
the query unit is used for carrying out a query process on each query vector in the open-source internet through the big data network model, converting query results obtained in one-to-one correspondence with each query vector into result vectors, and importing the result vectors into a network structure of the big data network model; the network structure gathers query vectors and result vectors through a mesh plane structure;
The grid unit is used for carrying out connection calibration with the query vector on the network structure based on the connection factors carried by the connection lines on the network structure, and screening out repeated and irrelevant impurity result vectors; the connection factors are corresponding relation factors when the query result is called in the Internet through the query vector;
the optimizing unit is used for searching the data on the Internet by cyclically repeating the steps of the query unit and the grid unit so as to optimize the connection factors on the network structure and judge whether the network structure reaches a preset load threshold value or not;
and the output unit is used for adopting a network structure trained by the big data network model if not, and responding to public query information carried in the data lottery request.
9. A computer device comprising a memory and a processor, the memory having stored therein a computer program, characterized in that the processor, when executing the computer program, implements the steps of the big data decision method of any of claims 1 to 7 applied to a public information platform.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the big data decision method applied to a public information platform as claimed in any of claims 1 to 7.
CN202410140961.6A 2024-02-01 Big data decision method, device and equipment applied to public information platform Active CN117668002B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410140961.6A CN117668002B (en) 2024-02-01 Big data decision method, device and equipment applied to public information platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410140961.6A CN117668002B (en) 2024-02-01 Big data decision method, device and equipment applied to public information platform

Publications (2)

Publication Number Publication Date
CN117668002A true CN117668002A (en) 2024-03-08
CN117668002B CN117668002B (en) 2024-05-17

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070094268A1 (en) * 2005-10-21 2007-04-26 Tabe Joseph A Broadband centralized transportation communication vehicle for extracting transportation topics of information and monitoring terrorist data
CN106327157A (en) * 2016-08-23 2017-01-11 黄毅 Online government service system and use method thereof
CN107577737A (en) * 2017-08-25 2018-01-12 北京百度网讯科技有限公司 Method and apparatus for pushed information
WO2021051517A1 (en) * 2019-09-19 2021-03-25 平安科技(深圳)有限公司 Information retrieval method based on convolutional neural network, and device related thereto
CN112632256A (en) * 2020-12-29 2021-04-09 平安科技(深圳)有限公司 Information query method and device based on question-answering system, computer equipment and medium
CN114817622A (en) * 2021-12-08 2022-07-29 广州酷狗计算机科技有限公司 Song fragment searching method and device, equipment, medium and product thereof
WO2022240906A1 (en) * 2021-05-11 2022-11-17 Strong Force Vcn Portfolio 2019, Llc Systems, methods, kits, and apparatuses for edge-distributed storage and querying in value chain networks
CN116415203A (en) * 2023-03-17 2023-07-11 北京无代码科技有限公司 Government information intelligent fusion system and method based on big data
CN116610218A (en) * 2023-06-12 2023-08-18 世优(北京)科技有限公司 AI digital person interaction method, device and system

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
廖祥文;谢媛媛;魏晶晶;桂林;程学旗;陈国龙;: "基于卷积记忆网络的视角级微博情感分类", 模式识别与人工智能, no. 03, 15 March 2018 (2018-03-15) *
张芳芳;曹兴超;: "基于字面和语义相关性匹配的智能篇章排序", 山东大学学报(理学版), no. 03, 7 March 2018 (2018-03-07) *
李昀;邓颖;吴华瑞;: "面向农业科研办公的垂直搜索引擎研究与设计", 西南师范大学学报(自然科学版), no. 09, 20 September 2020 (2020-09-20) *

Similar Documents

Publication Publication Date Title
CN108694225B (en) Image searching method, feature vector generating method and device and electronic equipment
CN110598206A (en) Text semantic recognition method and device, computer equipment and storage medium
CN110955761A (en) Method and device for acquiring question and answer data in document, computer equipment and storage medium
CN111191032B (en) Corpus expansion method, corpus expansion device, computer equipment and storage medium
CN110837738B (en) Method, device, computer equipment and storage medium for identifying similarity
CN113934830A (en) Text retrieval model training, question and answer retrieval method, device, equipment and medium
CN111325030A (en) Text label construction method and device, computer equipment and storage medium
CN111858854A (en) Question-answer matching method based on historical dialogue information and related device
CN112070550A (en) Keyword determination method, device and equipment based on search platform and storage medium
CN112100377A (en) Text classification method and device, computer equipment and storage medium
CN112632258A (en) Text data processing method and device, computer equipment and storage medium
CN113761868A (en) Text processing method and device, electronic equipment and readable storage medium
CN115495553A (en) Query text ordering method and device, computer equipment and storage medium
CN112667780A (en) Comment information generation method and device, electronic equipment and storage medium
CN110516240B (en) Semantic similarity calculation model DSSM (deep structured semantic model) technology based on Transformer
CN111859916A (en) Ancient poetry keyword extraction and poetry sentence generation method, device, equipment and medium
CN109086386B (en) Data processing method, device, computer equipment and storage medium
CN112434533B (en) Entity disambiguation method, entity disambiguation device, electronic device, and computer-readable storage medium
CN111400340B (en) Natural language processing method, device, computer equipment and storage medium
CN117668002B (en) Big data decision method, device and equipment applied to public information platform
CN112765976A (en) Text similarity calculation method, device and equipment and storage medium
CN117668002A (en) Big data decision method, device and equipment applied to public information platform
CN114266255B (en) Corpus classification method, apparatus, device and storage medium based on clustering model
CN115169342A (en) Text similarity calculation method and device, electronic equipment and storage medium
WO2023173547A1 (en) Text image matching method and apparatus, device, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant