CN108509424B

CN108509424B - System information processing method, apparatus, computer device and storage medium

Info

Publication number: CN108509424B
Application number: CN201810313040.XA
Authority: CN
Inventors: 韩梅; 张安元; 邓华威; 王科
Original assignee: Ping An Technology Shenzhen Co Ltd
Current assignee: Ping An Technology Shenzhen Co Ltd
Priority date: 2018-04-09
Filing date: 2018-04-09
Publication date: 2021-08-10
Anticipated expiration: 2038-04-09
Also published as: CN108509424A; WO2019196228A1

Abstract

The application relates to a system information processing method and apparatus, a computer device and a storage medium. The method comprises the following steps: monitoring system information issued by a terminal, and performing word segmentation on the system information to obtain a corresponding original word set, the original word set comprising a plurality of original words; performing synonym expansion on each original word to generate an expansion word set corresponding to each original word; forming an extended system information set corresponding to the system information according to the expansion word sets; inputting the extended system information set into a preset system management model to obtain a target category corresponding to the system information; and obtaining category labels respectively corresponding to a plurality of target information trees, screening out the target information trees whose category labels correspond to the target category, and adding the system information to the screened target information trees. The method can improve the efficiency and accuracy of system information classification.

Description

System information processing method, apparatus, computer device and storage medium

Technical Field

The present application relates to the field of computer technologies, and in particular, to a system information processing method and apparatus, a computer device, and a storage medium.

Background

Enterprise standardization unifies repetitive matters and concepts in activities such as enterprise production, operation and management by formulating, issuing and implementing system specifications, so as to improve the level of enterprise management. A system specification (hereinafter referred to as a "system") is a rule or criterion that employees must jointly follow in production and operation activities, and includes normative documents such as laws and policies, enterprise organization structure, management systems, post responsibilities, technical standards and workflows. To meet the working requirements of each post, an enterprise needs to classify and manage systems along different dimensions and construct a plurality of different information trees, such as a technical standard information tree and a legal policy information tree, so that systems of different types and purposes form different bodies of systems. When a new system is issued, it needs to be incorporated into the corresponding information tree. The same system may belong to several different information trees at the same time. As an enterprise grows, the amount of system information and the number of information trees keep increasing. Conventionally, this large amount of system information is classified and managed manually, which is inefficient and error-prone.

Disclosure of Invention

In view of the above, it is necessary to provide a system information processing method, apparatus, computer device, and storage medium capable of improving system information classification efficiency and accuracy.

A system information processing method, the method comprising: monitoring system information issued by a terminal, and performing word segmentation on the system information to obtain a corresponding original word set, the original word set comprising a plurality of original words; performing synonym expansion on each original word to generate an expansion word set corresponding to each original word; forming an extended system information set corresponding to the system information according to the expansion word sets; inputting the extended system information set into a preset system management model to obtain a target category corresponding to the system information; and obtaining category labels respectively corresponding to a plurality of target information trees, screening out the target information trees whose category labels correspond to the target category, and adding the system information to the screened target information trees.

In one embodiment, the system information comprises system description information; before performing word segmentation on the system information to obtain a corresponding original word set, the method further comprises the following steps: detecting whether the system description information contains category information or not; if yes, adding the system information to a corresponding target information tree according to the category information; otherwise, performing word segmentation on the system information to obtain a corresponding original word set.

In one embodiment, the generating step of the institutional management model comprises: acquiring training sample data; the training sample data comprises a plurality of sample system information and respectively corresponding category labels; performing word segmentation and synonymous expansion processing on each sample system information to obtain an expanded sample system information set corresponding to each sample system information; and training an initial system management model through a support vector machine algorithm according to each extended sample system information set and the corresponding category labels to obtain the system management model.

In one embodiment, the extended sample system information set comprises a plurality of groups of extended sample system information; training an initial system management model through a support vector machine algorithm according to each extended sample system information set and the corresponding category label comprises the following steps: acquiring a feature item, and calculating the word frequency weight of the feature item in a group of extended sample system information; calculating the document frequency of the feature item in the whole training sample data; calculating the feature weight corresponding to the feature item according to the word frequency weight and the document frequency; selecting the feature item as a feature word of the corresponding extended sample system information according to the feature weight; and performing feature extraction on each piece of extended sample system information according to the feature words.
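The weighting steps in this embodiment can be illustrated with a minimal sketch. The text does not give the formula that combines the word frequency weight and the document frequency, so the sketch assumes the common TF-IDF form (relative term frequency multiplied by the logarithm of the inverse document frequency); the function names `feature_weights` and `top_features` and the sample tokens are hypothetical.

```python
import math
from collections import Counter

def feature_weights(documents):
    """Return one {term: weight} dict per tokenized document.

    Assumption: weight = (count / document length) * log(N / document frequency).
    The source only says the weight is derived from both quantities.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))  # count each term once per document
    weights = []
    for doc in documents:
        term_freq = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        })
    return weights

def top_features(weight_map, k):
    """Pick the k highest-weighted terms as feature words (ties broken alphabetically)."""
    ranked = sorted(weight_map.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]

# Two hypothetical pieces of extended sample system information, already segmented.
docs = [["attendance", "leave", "rules"], ["sales", "commission", "rules"]]
weights = feature_weights(docs)
features = top_features(weights[0], 2)
```

Under this assumed formula, a term such as "rules" that appears in every sample receives a weight of zero, so it is not selected as a feature word.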

In one embodiment, the system information comprises system description information and a system file; adding the system information to the screened target information tree comprises the following steps: generating an information node according to the system description information; detecting whether an identical information node already exists in the screened target information tree; and if no identical information node exists, adding the information node to the corresponding target information tree, and associating the system file with the information node.

In one embodiment, the system information comprises system description information and an associated system file; the system file comprises a plurality of system clauses and applicable object identifications respectively corresponding to the system clauses; the associated information tree has corresponding applicable object identification; the method further comprises the following steps: splitting the system file, and generating system subfiles corresponding to the corresponding applicable object identifications by using the system clauses corresponding to each applicable object identification; acquiring a plurality of associated information trees corresponding to the target information tree; and adding the system description information and the system subfiles to corresponding associated information trees according to the applicable object identifiers.

In one embodiment, the splitting the system file comprises: calculating the data volume of the system file, and detecting whether the data volume exceeds a threshold value; when the data volume exceeds a threshold value, acquiring a preset target data volume, and determining the splitting position of the system file according to the target data volume; detecting whether the split location is located between adjacent separators; when the splitting position is located at a separator, splitting the system file into a plurality of intermediate files at the splitting position; when the splitting position is located between adjacent separators, splitting the system file into a plurality of intermediate files at any one of the adjacent separators; and splitting the plurality of intermediate files according to a preset splitting rule.
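The size-based part of this splitting step can be sketched as follows. This is only an illustration under stated assumptions: the data volume is measured in characters, the separator is a newline, and when a splitting position falls between two separators the sketch chooses the preceding one (the text allows either). The function name `split_system_file` is hypothetical, and the later splitting "according to a preset splitting rule" is not shown.

```python
def split_system_file(text, threshold, target_size, sep="\n"):
    """Split text into intermediate pieces of roughly target_size characters.

    Pieces always end at a separator, so no clause is cut in half.
    """
    if len(text) <= threshold:
        return [text]  # data volume does not exceed the threshold: no split
    parts = []
    start = 0
    while len(text) - start > target_size:
        pos = start + target_size  # nominal splitting position
        # Last separator lying inside the current window, if any.
        idx = text.rfind(sep, start, pos)
        if idx == -1:
            # No separator before the position: fall forward to the next one.
            idx = text.find(sep, pos)
            if idx == -1:
                break  # no separator left; keep the remainder whole
        cut = idx + len(sep)
        parts.append(text[start:cut])
        start = cut
    if start < len(text):
        parts.append(text[start:])
    return parts

# Hypothetical system file with one clause per line.
parts = split_system_file("clause 1\nclause 2\nclause 3\n", threshold=20, target_size=12)
```

Concatenating the returned pieces reproduces the input unchanged, which makes the sketch easy to check.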

A system information processing apparatus, the apparatus comprising:

an information expansion module, configured to monitor system information issued by a terminal and perform word segmentation on the system information to obtain a corresponding original word set, the original word set comprising a plurality of original words; perform synonym expansion on each original word to generate an expansion word set corresponding to each original word; and form an extended system information set corresponding to the system information according to the expansion word sets;

the information classification module is used for inputting the extended system information set into a preset system management model to obtain a target type corresponding to the system information;

and the information filing module is used for acquiring the class labels corresponding to the target information trees respectively, screening the target information trees containing the class labels corresponding to the target classes, and adding the system information to the screened target information trees.

A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program: monitoring system information issued by a terminal, and segmenting the system information to obtain a corresponding original word set; the original set of terms comprises a plurality of original terms; synonymy expanding is carried out on each original word, and an expanded word set corresponding to each original word is generated; forming an extended system information set corresponding to the system information according to each extended word set; inputting the extended system information set into a preset system management model to obtain a target type corresponding to the system information; obtaining category labels corresponding to a plurality of target information trees respectively, screening the target information trees containing the category labels corresponding to the target categories, and adding the system information to the screened target information trees.

A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of: monitoring system information issued by a terminal, and segmenting the system information to obtain a corresponding original word set; the original set of terms comprises a plurality of original terms; synonymy expanding is carried out on each original word, and an expanded word set corresponding to each original word is generated; forming an extended system information set corresponding to the system information according to each extended word set; inputting the extended system information set into a preset system management model to obtain a target type corresponding to the system information; obtaining category labels corresponding to a plurality of target information trees respectively, screening the target information trees containing the category labels corresponding to the target categories, and adding the system information to the screened target information trees.

According to the system information processing method, the device, the computer equipment and the storage medium, the system information is segmented by monitoring the newly issued system information to obtain a corresponding original word set; by acquiring synonyms corresponding to all original words in the original word set, an expanded word set can be formed by utilizing the original words and the corresponding synonyms; according to the expansion word set corresponding to each original word, an expansion system information set corresponding to the system information can be formed; inputting the extended system information set into a trained system management model to obtain a target type corresponding to system information; and matching the target type with the type labels respectively corresponding to a plurality of pre-stored target information trees to obtain the target information trees capable of containing the system information by screening, and adding the system information to the target information trees obtained by screening. An expansion word set corresponding to each original word is formed first, and then an expansion system information set is formed through the expansion word set, so that the expansion degree of expansion system information is greatly improved, each expanded system information expresses the meaning which is the same as or similar to the system information, and the effective coverage range of the system information is improved, therefore, after a trained system management model is subsequently input, the accuracy of target categories can be improved, the system information can be accurately brought into a corresponding target information tree, and the system information classification efficiency and accuracy are improved.

Drawings

FIG. 1 is a diagram of an application environment of a system information processing method according to an embodiment;

FIG. 2 is a schematic flow chart diagram of a system information processing method according to an embodiment;

FIG. 3 is a diagram illustrating a target information tree in a system information processing method according to an embodiment;

FIG. 4 is a flowchart illustrating the steps of constructing a tree of association information in one embodiment;

FIG. 5 is a diagram illustrating an associated information tree in a system information processing method according to an embodiment;

FIG. 6 is a block diagram showing the structure of a system information processing apparatus according to an embodiment;

FIG. 7 is a diagram illustrating an internal structure of a computer device according to an embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.

The system information processing method provided by the application can be applied to the application environment shown in fig. 1. Wherein the terminal 102 and the server 104 communicate via a network. The terminal 102 may be, but not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices, and the server 104 may be implemented by an independent server or a server cluster formed by a plurality of servers.

A variety of target information trees are stored in the server 104. Each target information tree has a corresponding category label. The server 104 monitors whether the terminal 102 issues new system information; when it detects that the terminal 102 has issued new system information, the server 104 classifies the system information and incorporates it into the corresponding target information tree. Specifically, the server 104 performs word segmentation on the system information to obtain an original word set including a plurality of original words. The server 104 obtains the synonyms of each original word, and the original word together with its synonyms forms an expansion word set, so that each original word has a corresponding expansion word set. Following the order in which the original words appear in the system information, the server 104 randomly selects one word from the expansion word set corresponding to each original word and combines the selected words in that order into one piece of extended system information. Selecting different words from the expansion word sets yields different pieces of extended system information, which together form the extended system information set. The server 104 inputs the extended system information set into the trained system management model, and determines the target category corresponding to the system information by using the system management model. The server 104 acquires the category label corresponding to the target category, screens out the target information trees containing the acquired category label, and adds the system information to the screened target information trees. Because the system information undergoes word segmentation and synonym expansion, its effective coverage is increased, so the accuracy of the target category obtained from the system management model can be improved, the system information can be accurately incorporated into the corresponding target information tree, and the efficiency and accuracy of system information classification are improved.

In one embodiment, as shown in fig. 2, a system information processing method is provided. The method is described by taking its application to the server in fig. 1 as an example, and includes the following steps:

step 202, monitoring system information issued by a terminal, and segmenting the system information to obtain a corresponding original word set; the original set of words includes a plurality of original words.

The server monitors whether the first terminal issues new system information. The system information comprises system description information and associated system files. The system description information comprises a system code, a system name, a system level, an issuing unit, an issue date, an applicable object identifier, an information abstract, and the like. The system information may be text information, voice information, image information, video information, or the like. Voice, image or video information can be converted into text information through voice recognition or image processing. The system file comprises a plurality of system clauses and an applicable object identifier corresponding to each system clause. The applicable object identifier identifies an object that needs to execute or understand the system, and may be a post identifier, an organization identifier, or the like.

When the server detects that the first terminal has issued new system information, it classifies the system information. Specifically, the server performs word segmentation on the system information through a word segmentation algorithm to obtain an original word set. The original word set includes a plurality of original words. In one embodiment, after the original words are obtained, words that have little influence on classification, such as stop words, modal particles and punctuation marks, are removed, so as to improve the efficiency of subsequent feature extraction. Stop words are words that occur in the system information more frequently than a preset threshold but carry little practical meaning, e.g., "my", "him", etc.
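The segmentation and filtering described above can be sketched as follows. The word segmentation algorithm is not named in the text, so a simple regular-expression tokenizer stands in for it here, and a fixed stop-word list replaces the frequency-threshold test; `STOP_WORDS`, `segment` and `original_word_set` are hypothetical names, and only "my" and "him" come from the description.

```python
import re

# Hypothetical stop-word list; the description only gives "my" and "him" as examples.
STOP_WORDS = {"my", "him", "the", "a", "of"}

def segment(text):
    """Split text into lowercase word tokens, discarding punctuation.

    Stand-in for the unspecified word segmentation algorithm.
    """
    return re.findall(r"[a-z0-9]+", text.lower())

def original_word_set(text):
    """Return the tokens in order of appearance with stop words removed."""
    return [word for word in segment(text) if word not in STOP_WORDS]

words = original_word_set("Do not disclose the salary of him.")
```

The result keeps the order of appearance, which the later expansion step relies on when it recombines words into extended system information.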

In one embodiment, before segmenting the institutional information to obtain the corresponding original word set, the method further comprises: detecting whether system description information contains category information or not; if yes, adding system information to a corresponding target information tree according to the category information; otherwise, performing word segmentation on the system information to obtain a corresponding original word set.

When the terminal issues system information, the type information of the system information can be pre-marked, so that the server can incorporate the system information into the corresponding target information tree according to the type information. If the system description information does not contain the category information of the system information, the system information can be classified and managed according to the system information processing method provided by the application.

Step 204, performing synonym expansion on each original word to generate an expansion word set corresponding to each original word.

The server obtains the synonyms of each original word in the original word set, and each original word together with its synonyms forms an expansion word set, so that each original word has a corresponding expansion word set. Synonyms are words having the same or a similar meaning as the original word. For example, if the original word is "do not", its synonyms may be "forbid", "avoid", "stop", etc., and the expansion word set corresponding to the original word "do not" is {do not, forbid, avoid, stop}. If the original word set is {a, b, c}, each original word in it has a corresponding expansion word set; for example, a corresponds to the expansion word set {a, a1, a2}, b corresponds to {b, b1, b2, b3}, and c corresponds to {c, c1, c2}.
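A minimal sketch of this expansion step is given below. The text does not say where the synonyms come from, so a small in-memory dictionary is assumed; `SYNONYMS`, `expand_word` and `expand_all`, as well as the sample words, are hypothetical.

```python
# Hypothetical synonym dictionary; the source of synonyms is not specified in the text.
SYNONYMS = {
    "forbid": ["prohibit", "ban"],
    "leave": ["vacation"],
}

def expand_word(word, synonyms=SYNONYMS):
    """Return the expansion word set: the original word first, then its synonyms."""
    expanded = [word]
    for synonym in synonyms.get(word, []):
        if synonym not in expanded:  # avoid duplicates while keeping order
            expanded.append(synonym)
    return expanded

def expand_all(original_words):
    """Build one expansion word set per original word, in order of appearance."""
    return [expand_word(word) for word in original_words]

expansion_sets = expand_all(["forbid", "unpaid", "leave"])
```

A word without known synonyms simply yields a one-element expansion word set containing itself.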

Step 206, forming an extended system information set corresponding to the system information according to each expansion word set.

Following the order in which the original words appear in the system information, the server randomly selects one word from the expansion word set corresponding to each original word and combines the selected words in that order into one piece of extended system information. Selecting different words from the expansion word sets yields different pieces of extended system information, which together form the extended system information set.

In one embodiment, the server calculates the Cartesian product of the expansion word sets corresponding to the original words to form an extended system information set consisting of different pieces of extended system information. The Cartesian product (also called the direct product) of two sets X and Y, denoted X × Y, is the set of all possible ordered pairs whose first element is a member of X and whose second element is a member of Y.
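The Cartesian-product embodiment can be sketched with the standard library. The expansion word sets below reuse the {a, a1, a2}, {b, b1, b2, b3}, {c, c1, c2} example from the description; joining words with a space and the function name `expand_system_information` are assumptions made for illustration.

```python
from itertools import product

def expand_system_information(expansion_sets):
    """Combine one word from each expansion word set, preserving word order.

    Each combination is one piece of extended system information.
    """
    return [" ".join(combination) for combination in product(*expansion_sets)]

expansion_sets = [
    ["a", "a1", "a2"],
    ["b", "b1", "b2", "b3"],
    ["c", "c1", "c2"],
]
expanded = expand_system_information(expansion_sets)
```

For this example the set contains 3 × 4 × 3 = 36 pieces of extended system information. The size grows multiplicatively with the number of original words, which may be why the random-selection variant is also described.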

Step 208, inputting the extended system information set into a preset system management model to obtain a target category corresponding to the system information.

The system management model is used to determine, from a plurality of candidate categories, the target category corresponding to its input. The system management model may be a model trained with a logistic regression algorithm, a support vector machine algorithm, or the like. Internally, the system management model may be formed by connecting a plurality of sub-management models. Because the input of the trained system management model is the extended system information set, in which each piece of extended system information expresses a meaning the same as or similar to that of the system information, the effective coverage of the system information is increased, and the accuracy of the target category obtained from the trained system management model can be improved.

Step 210, obtaining category labels corresponding to the plurality of target information trees, respectively, screening the target information trees containing the category labels corresponding to the target categories, and adding system information to the screened target information trees.

The server stores a variety of target information trees. As shown in fig. 3, each target information tree includes a plurality of information nodes and a system file associated with each information node. The system file can be a file with various formats, such as pdf document, jpg image, xls table, mp3 audio or avi video, etc. Different information nodes can be arranged in the target information tree according to the issuing time. It is to be understood that one system information may not have an associated system file, and may also have a plurality of associated system files, without limitation.

Each target information tree has a corresponding category label. The category label identifies the category of information nodes that the corresponding target information tree can contain, such as an administrative management category, a sales management category or a risk management category. The server obtains the category label corresponding to the target category, and screens out one or more target information trees containing the obtained category label. The server generates an information node according to the system description information. For example, a system code and/or a system name may be used as an information node. The server associates the system file with the information node, and adds the information node associated with the system file to the screened target information tree.

In one embodiment, the system information includes system description information and a system file; adding the system information to the screened target information tree comprises: generating an information node according to the system description information; detecting whether an identical information node already exists in the screened target information tree; and if no identical information node exists, adding the information node to the corresponding target information tree, and associating the system file with the information node.

If an identical information node already exists in the screened target information tree, the server only needs to associate the system file with the existing information node. In another embodiment, the server judges, according to the system description information, whether the generated information node and the existing identical information node are parallel nodes or parent-child nodes. When the generated information node and the existing identical information node are parallel nodes, the server marks the two nodes so that they can be distinguished, adds the marked information node to the corresponding target information tree, and associates the system file with the marked information node.

When the generated information node and the existing identical information node are parent-child nodes, the server qualifies the generated information node according to the system description information, that is, it extracts keywords from the system description information and performs semantic expansion on the generated information node by using the extracted keywords. For example, if the information node generated according to the system name is "company welfare management system", and the keyword "research and development department" is extracted from the system description information, the semantically expanded information node may be "company research and development department welfare management system". The server adds the semantically expanded information node to the corresponding target information tree as a child node of the existing identical information node, and associates the system file with the child node.
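The basic filing logic described above (screen the target information trees by category label, then either create the information node or reuse an identical existing one) can be sketched as follows. The data layout is an assumption: each tree is modelled with a single category label and a name-indexed node table, and the parallel-node and parent-child handling is left out. `InfoNode`, `TargetTree`, `screen_trees` and `add_system_information` are hypothetical names.

```python
from dataclasses import dataclass, field

@dataclass
class InfoNode:
    name: str                                   # e.g. system code and/or system name
    files: list = field(default_factory=list)   # associated system files

@dataclass
class TargetTree:
    category_label: str
    nodes: dict = field(default_factory=dict)   # node name -> InfoNode

def screen_trees(trees, target_category):
    """Keep the target information trees whose label matches the target category."""
    return [tree for tree in trees if tree.category_label == target_category]

def add_system_information(tree, node_name, system_file):
    """Add a node for the system information, or reuse an identical existing node."""
    node = tree.nodes.get(node_name)
    if node is None:
        node = InfoNode(node_name)
        tree.nodes[node_name] = node
    node.files.append(system_file)              # associate the system file with the node
    return node

trees = [TargetTree("administrative management"), TargetTree("sales management")]
matched = screen_trees(trees, "administrative management")
for tree in matched:
    add_system_information(tree, "attendance rules", "attendance_v1.pdf")
    add_system_information(tree, "attendance rules", "attendance_v2.pdf")
```

In this sketch the second call finds the existing "attendance rules" node and only associates the new file with it, mirroring the case where an identical information node already exists.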

In the embodiment, the system information is segmented to obtain a corresponding original word set by monitoring the newly issued system information; by acquiring synonyms corresponding to all original words in the original word set, an expanded word set can be formed by utilizing the original words and the corresponding synonyms; according to the expansion word set corresponding to each original word, an expansion system information set corresponding to the system information can be formed; inputting the extended system information set into a trained system management model to obtain a target type corresponding to system information; and matching the target type with the type labels respectively corresponding to a plurality of pre-stored target information trees to obtain the target information trees capable of containing the system information by screening, and adding the system information to the target information trees obtained by screening. An expansion word set corresponding to each original word is formed first, and then an expansion system information set is formed through the expansion word set, so that the expansion degree of expansion system information is greatly improved, each expanded system information expresses the meaning which is the same as or similar to the system information, and the effective coverage range of the system information is improved, therefore, after a trained system management model is subsequently input, the accuracy of target categories can be improved, the system information can be accurately brought into a corresponding target information tree, and the system information classification efficiency and accuracy are improved.

In one embodiment, the generating of the institutional management model comprises: acquiring training sample data; the training sample data comprises a plurality of sample system information and respectively corresponding category labels; performing word segmentation and synonymy expansion processing on each sample system information to obtain an expanded sample system information set corresponding to each sample system information; and training an initial system management model through a support vector machine algorithm according to each extended sample system information set and the corresponding category labels to obtain the system management model.

The training sample data may be a plurality of pieces of sample system information that have already been issued. Each piece of sample system information has a corresponding category label describing its actual category. For example, if the system name corresponding to the sample system information is "attendance note", the category label corresponding to the sample system information may be "administrative management". The training sample data comprises sample system information for all possible categories, so as to ensure accuracy in determining each category. In one specific embodiment, the training sample data comprises 476 pieces of sample system information, and the total number of category labels is 57.

The server performs word segmentation on each piece of training sample information through a word segmentation algorithm to obtain words, which form an original training word set corresponding to that training sample information. The server acquires the synonyms of each original training word, and each original training word together with its synonyms forms an extended training word set. The extended training term set comprises a plurality of groups

The server first takes one piece of training sample information as the current training sample information, and obtains each of its original training words together with the extended training word set corresponding to each original training word. Following the order in which the original training words appear in the current training sample information, the server then randomly selects one word from the extended training word set of each original training word and combines the selected words in that order into a piece of extended sample system information. The different pieces of extended sample system information form an extended sample system information set, and each piece of sample system information has its own extended sample system information set. In one embodiment, the server calculates the Cartesian product of the extended training word sets corresponding to the original training words to form the extended sample system information set corresponding to each piece of sample system information.
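The Cartesian-product variant can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `expand_information`, the `synonyms` lookup table, and the sample words are all hypothetical, and word segmentation is assumed to have been done already.

```python
from itertools import product


def expand_information(original_words, synonyms):
    """Build every expanded variant of one piece of sample system information.

    original_words: segmented words, in order of appearance.
    synonyms: hypothetical lookup table mapping a word to its synonyms.
    """
    # Extended word set per original word: the word itself plus its synonyms.
    expanded_sets = [[w] + list(synonyms.get(w, [])) for w in original_words]
    # The Cartesian product picks one word per position and keeps word order.
    return [list(combo) for combo in product(*expanded_sets)]


words = ["attendance", "rules"]
synonyms = {"attendance": ["presence"], "rules": ["regulations", "policy"]}
expanded = expand_information(words, synonyms)
```

With two candidates for the first word and three for the second, the set holds six variants, and the unmodified original is always among them. The set grows multiplicatively with the number of synonyms, which is presumably why the text also describes random selection as an alternative.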

The support vector machine algorithm is a machine learning algorithm for pattern recognition and classification. Its main idea is to establish an optimal decision hyperplane such that the distance between the hyperplane and the nearest samples of the two classes on either side of it is maximized, which gives good generalization on classification problems. For a multidimensional sample set, the system randomly generates a hyperplane and moves it continuously to classify the samples until the sample points belonging to different classes in the training samples lie on opposite sides of the hyperplane. Several hyperplanes may satisfy this condition; the support vector machine algorithm finds, while preserving classification accuracy, the hyperplane that maximizes the blank area on both sides, thereby achieving optimal classification of linearly separable samples. The support vector machine algorithm is a supervised training method. In one embodiment, the system management model is formed by connecting a plurality of sub-management models.
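For reference, the margin-maximizing hyperplane for linearly separable samples is conventionally written as the optimization problem below. The symbols are standard textbook notation and are not taken from the patent: $x_k$ is the feature vector of the $k$-th training sample, $y_k \in \{-1, +1\}$ is its class, and $w$ and $b$ define the hyperplane $w^\top x + b = 0$.

$$
\min_{w,\,b}\ \frac{1}{2}\lVert w \rVert^2
\quad \text{subject to} \quad
y_k\,(w^\top x_k + b) \ge 1, \qquad k = 1, \dots, n
$$

The blank area between the two classes has width $2 / \lVert w \rVert$, so minimizing $\lVert w \rVert$ maximizes it. This formulation separates two classes only; a model with many category labels is typically assembled from several such binary classifiers, which may be what the plurality of sub-management models refers to.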

In this embodiment, a large amount of published system information is subjected to word segmentation and synonym expansion, and the resulting extended sample system information sets greatly widen the effective coverage of the sample system information. Training the system management model on these extended sample system information sets with the support vector machine algorithm can improve its classification accuracy.

In one embodiment, the extended sample system information set comprises a plurality of groups of extended sample system information, and training the initial system management model with the support vector machine algorithm according to each extended sample system information set and the corresponding category label comprises the following steps: acquiring a feature item, and calculating the term frequency weight of the feature item in a group of extended sample system information; calculating the document frequency of the feature item in the whole training sample data; calculating the feature weight corresponding to the feature item according to the term frequency weight and the document frequency; selecting the feature item as a feature word of the corresponding extended sample system information according to the feature weight; and extracting features of each piece of extended sample system information according to the feature words.

A feature item may be any word in a group of extended sample system information. The term frequency weight is the frequency with which the feature item occurs in that group of extended sample system information. It is understood that if synonyms of the feature item appear in the extended sample system information, they are also counted as occurrences of the feature item. The term frequency weight is typically normalized and may be expressed as $TF_{ij}$, where $i$ is the identifier of the feature item and $j$ is the category identifier. The document frequency $DF_i$ measures the general importance of a word and can be obtained by dividing the number of pieces of extended sample system information containing the feature item by the total number of pieces of training sample information in the training sample data.

The more frequently a feature item occurs in a piece of extended sample system information, the greater its influence on that information; that is, the feature weight is directly proportional to the term frequency weight. The more pieces of extended sample system information a feature item appears in, the less it contributes to classifying the information; that is, the feature weight is inversely proportional to the document frequency. In one embodiment, the feature weight is $w_{ij} = TF_{ij} \cdot \log(N / DF_i)$, where $N$ is the total number of pieces of training sample information in the training sample data.

If the feature weight exceeds a preset threshold, the feature item is an important word of that group of extended sample system information and can be used as one of its feature words. The features of each piece of extended sample system information in the extended sample system information set can then be extracted according to the determined feature words. A piece of extended sample system information may have one or more feature words.
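The weighting and threshold selection described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names and sample documents are hypothetical, the term frequency is normalized by document length, and the document frequency is taken as the count of documents containing the word, so that `log(N / DF)` is the usual inverse document frequency.

```python
import math
from collections import Counter


def feature_weights(documents):
    """Return one {word: weight} dict per document, weight = TF * log(N / DF).

    documents: list of word lists, one per piece of extended sample information.
    Assumption: DF is the number of documents containing the word.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))  # count each word once per document
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            word: (count / len(doc)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return weights


def select_feature_words(doc_weights, threshold):
    """Keep the words whose feature weight exceeds the preset threshold."""
    return sorted(w for w, weight in doc_weights.items() if weight > threshold)


docs = [
    ["attendance", "leave", "rules"],
    ["expense", "rules"],
    ["attendance", "overtime", "rules"],
]
weights = feature_weights(docs)
feature_words = select_feature_words(weights[0], threshold=0.2)
```

A word that occurs in every document, such as "rules" here, receives weight zero and is never selected, which matches the observation that widely shared words contribute little to classification.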

In this embodiment, the term frequency weight and the document frequency of each word in the extended sample system information are computed to determine how strongly each word characterizes that information, and the features of each piece of extended sample system information in the extended sample system information set are extracted according to the resulting feature weights. The system management model can therefore accurately extract features of system information expressed in varied language, and in turn classify the system information accurately.

In one embodiment, the system information comprises system description information and an associated system file; the system file comprises a plurality of system clauses and the applicable object identifiers respectively corresponding to the system clauses; and each associated information tree has a corresponding applicable object identifier. The method further comprises the step of constructing associated information trees. As shown in fig. 4, the step of constructing the associated information trees includes:

Step 402, splitting the system file, and generating a system subfile for each applicable object identifier from the system clauses corresponding to that applicable object identifier.

To meet the working requirements of every post, an enterprise may record system information applicable to different posts in the same system file, so that users can only query a system against the full content of the system file, which reduces query efficiency. This embodiment constructs different associated information trees for different posts. Specifically, the server splits the system clauses in the system file according to the applicable object identifier corresponding to each system clause, and generates a system subfile for each applicable object identifier. For example, system file A comprises four system clauses, X1 to X4. The applicable object identifiers corresponding to X1 are a and b; the applicable object identifier corresponding to X2 is a; the applicable object identifiers corresponding to X3 are a, b, c, d, and e; and the applicable object identifiers corresponding to X4 are a and d. System file A thus involves five applicable object identifiers, a, b, c, d, and e, and splitting it yields five system subfiles, A1 to A5. System subfile A1, corresponding to applicable object identifier a, comprises the four system clauses X1 to X4; system subfile A2, corresponding to applicable object identifier b, comprises the two system clauses X1 and X3; and so on.
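The split in the example above can be reproduced with a short sketch. The function name and the data layout (a list of clause/identifier pairs) are assumptions made for illustration, and the applicable object identifiers are written in lowercase to keep them distinct from system file A.

```python
def split_by_applicable_object(clauses):
    """Group system clauses into one subfile per applicable object identifier.

    clauses: list of (clause_id, [applicable object identifiers]) pairs.
    Returns {identifier: [clause_id, ...]} with clauses in file order.
    """
    subfiles = {}
    for clause_id, object_ids in clauses:
        for object_id in object_ids:
            subfiles.setdefault(object_id, []).append(clause_id)
    return subfiles


system_file_a = [
    ("X1", ["a", "b"]),
    ("X2", ["a"]),
    ("X3", ["a", "b", "c", "d", "e"]),
    ("X4", ["a", "d"]),
]
subfiles = split_by_applicable_object(system_file_a)
```

Here `subfiles["a"]` plays the role of system subfile A1 and `subfiles["b"]` that of A2. A clause that applies to several posts is copied into each of their subfiles.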

Step 404, a plurality of associated information trees corresponding to the target information tree are obtained.

Each target information tree has a corresponding plurality of associated information trees. Each information node in the target information tree has a corresponding one or more applicable object identifiers. Different applicable object identifications in the target information tree respectively have a corresponding associated information tree. In other words, the number of applicable object identifiers contained in the target information tree is equal to the number of corresponding associated information trees, so that each post corresponding to an applicable object identifier has a corresponding associated information tree.

The target information tree records system information applicable to all posts of the enterprise, whereas an associated information tree only needs to record system information applicable to a single post. Each associated information tree has a corresponding applicable object identifier. As shown in fig. 5, position 1 does not need to execute or know the systems corresponding to information node 4 and information node 9, so the associated information tree for the applicable object identifier "position 1" lacks information node 4 and information node 9 when compared with the target information tree in fig. 3. It is easy to understand that the directory hierarchy of the information nodes in the associated information tree does not have to match that of the target information tree and can be adjusted as needed. For the information nodes that remain in the associated information tree, the content of the associated system file may differ from the content of the system file associated with the corresponding information node in the target information tree.

Step 406, adding the system description information and the system subfiles to the corresponding associated information trees according to the applicable object identifiers.

After adding the system information to the corresponding target information tree, the server acquires the associated information trees of that target information tree according to the applicable object identifiers recorded in the system file. It is easy to understand that the server only needs to acquire the associated information trees whose applicable object identifiers are recorded in the system file. For example, suppose the system information is classified and added to three target information trees, including target information tree M. The applicable object identifiers of target information tree M include a, b, c, d, and e. If, as in the example above, the system file contains only content applicable to a, b, c, d, and e, the server only needs to acquire the associated information trees of target information tree M that correspond to a, b, c, d, and e respectively.

The server generates information nodes according to the system description information and associates the system subfiles obtained by splitting with the information nodes respectively. The server then adds each information node, now associated with a different system subfile, to the associated information tree whose applicable object identifier matches that of the subfile. Continuing the example above, the information node associated with system subfile A1 is added to the associated information tree $M_a$, which corresponds to applicable object identifier a in target information tree M; the information node associated with system subfile A2 is added to the associated information tree $M_b$, which corresponds to applicable object identifier b in target information tree M; and so on.

When a system query request sent by a second terminal is received, the server acquires the associated information tree corresponding to the applicable object identifier. The system query request carries the applicable object identifier and a query condition. The server searches the associated information tree for information nodes that satisfy the query condition, acquires the system subfiles associated with those information nodes, and sends the system subfiles to the second terminal.
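The add and query flow can be sketched as follows. This is a simplified illustration rather than the described data structure: each associated information tree is flattened to a list of nodes, the class and method names are hypothetical, and the query condition is assumed to be a keyword match against the system description.

```python
from collections import defaultdict


class AssociatedTrees:
    """Flat stand-in for the associated information trees of one target tree."""

    def __init__(self):
        # applicable object identifier -> list of information nodes
        self.trees = defaultdict(list)

    def add(self, description, subfiles):
        """Add one node per subfile to the tree with the matching identifier."""
        for object_id, clauses in subfiles.items():
            self.trees[object_id].append(
                {"description": description, "subfile": clauses}
            )

    def query(self, object_id, keyword):
        """Return the subfiles of nodes in one tree whose description matches."""
        return [
            node["subfile"]
            for node in self.trees.get(object_id, [])
            if keyword in node["description"]
        ]


trees = AssociatedTrees()
trees.add("attendance rules", {"a": ["X1", "X2", "X3", "X4"], "b": ["X1", "X3"]})
result_b = trees.query("b", "attendance")
result_c = trees.query("c", "attendance")
```

A query for identifier b only ever sees the clauses in that post's subfile, which is the source of the query-efficiency gain claimed for the associated information trees.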

In this embodiment, when system information is released, the system file recording system information applicable to different posts is split, and the system clauses that each post needs to execute or know are selected, which meets the individual requirements of different posts. Associated information trees containing only the content required by the corresponding post are constructed for the different posts, and the generation of all associated information trees is fully automatic, saving time and labor. Users subsequently only need to query the associated information tree that applies to them, which improves system query efficiency.

In one embodiment, splitting the system file comprises: calculating the data volume of the system file, and detecting whether the data volume exceeds a threshold; when the data volume exceeds the threshold, acquiring a preset target data volume, and determining the splitting positions of the system file according to the target data volume; detecting whether each splitting position is located between adjacent separators; when a splitting position is located at a separator, splitting the system file into a plurality of intermediate files at that splitting position; when a splitting position is located between adjacent separators, splitting the system file into a plurality of intermediate files at either of the adjacent separators; and splitting the plurality of intermediate files according to a preset splitting rule.

The server calculates the data volume of the system file and detects whether it exceeds a threshold. The threshold may be set in advance or generated on the fly from the server's load monitoring results. When the data volume exceeds the threshold, the server first splits the system file into a plurality of intermediate files with smaller data volumes, and then splits each intermediate file into a plurality of system subfiles. Specifically, the server acquires a preset target data volume and determines the splitting positions of the system file according to it. The target data volume may be set in advance or generated on the fly from the load monitoring results of other servers in the cluster. For example, if the data volume of system file A is 720M and the target data volume is 80M, the 80M position of the system file is marked as the first splitting position, the 160M position as the second splitting position, and so on.

The server identifies whether each splitting position is located between adjacent separators. When a splitting position coincides with a separator, the server splits the system file at that position to obtain intermediate files. When a splitting position lies between adjacent separators, the server splits the system file at either of the adjacent separators, that is, at the preceding separator or at the following separator, to obtain intermediate files. The server then uses multiple threads to split the intermediate files into system subfiles in the manner described above, or sends the intermediate files to other servers in the cluster for splitting, which improves file splitting efficiency. Splitting a system file with a large data volume into intermediate files with smaller data volumes before transmitting them to other servers in the cluster also improves data transmission efficiency.

In this embodiment, a system file with a large data volume is split in two stages: the first-stage split is performed according to data volume, and the second-stage split is performed according to a preset splitting dimension. Because the large system file is first split into smaller intermediate files, the intermediate files can be split into system subfiles in parallel, which improves file splitting efficiency.
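The first-stage split can be sketched as follows. This is a minimal sketch under stated assumptions: the function name is hypothetical, offsets are plain integers, the sorted list `separators` holds the offsets at which a cut is allowed, and when a nominal position falls between two separators the preceding one is chosen (the text permits either).

```python
from bisect import bisect_left


def split_positions(size, target, separators):
    """Return cut offsets near multiples of `target`, snapped to separators.

    size: data volume of the system file.
    target: preset target data volume of one intermediate file.
    separators: sorted offsets of separators in the file.
    """
    cuts = []
    pos = target
    while pos < size:
        i = bisect_left(separators, pos)
        if i < len(separators) and separators[i] == pos:
            cut = pos                    # nominal position is on a separator
        elif i > 0:
            cut = separators[i - 1]      # assumption: use the preceding separator
        elif i < len(separators):
            cut = separators[i]          # no preceding separator: use the next
        else:
            cut = None                   # no separators at all: do not cut
        # Skip cuts that do not advance past the previous one.
        if cut is not None and (not cuts or cut > cuts[-1]):
            cuts.append(cut)
        pos += target
    return cuts


cuts = split_positions(size=400, target=80, separators=[80, 150, 170, 250, 330])
```

Snapping to a separator keeps every system clause intact within one intermediate file, at the cost of intermediate files that are not exactly the target size.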

It should be understood that although the steps in the flowcharts of fig. 2 and fig. 4 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in fig. 2 and fig. 4 may include multiple sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

In one embodiment, as shown in fig. 6, there is provided a system information processing apparatus including an information expansion module 602, an information classification module 604, and an information archiving module 606, wherein:

The information expansion module 602 is configured to monitor system information issued by a terminal, and perform word segmentation on the system information to obtain a corresponding original word set, the original word set comprising a plurality of original words; perform synonym expansion on each original word to generate an expanded word set corresponding to each original word; and form an extended system information set corresponding to the system information according to the expanded word sets.

The information classification module 604 is configured to input the extended system information set into a preset system management model to obtain the target category corresponding to the system information.

The information archiving module 606 is configured to obtain category labels corresponding to the plurality of target information trees, screen a target information tree including a category label corresponding to a target category, and add system information to the screened target information tree.

In one embodiment, the system information includes system description information; the information expansion module 602 is further configured to detect whether the system description information contains category information; if so, add the system information to the corresponding target information tree according to the category information; otherwise, perform word segmentation on the system information to obtain the corresponding original word set.

In one embodiment, the apparatus further comprises a model training module 608 for obtaining training sample data; the training sample data comprises a plurality of sample system information and respectively corresponding category labels; performing word segmentation and synonymy expansion processing on each sample system information to obtain an expanded sample system information set corresponding to each sample system information; and training an initial system management model through a support vector machine algorithm according to each extended sample system information set and the corresponding category labels to obtain the system management model.

In one embodiment, the extended sample system information set comprises a plurality of groups of extended sample system information; the model training module 608 is further configured to acquire a feature item, and calculate the term frequency weight of the feature item in a group of extended sample system information; calculate the document frequency of the feature item in the whole training sample data; calculate the feature weight corresponding to the feature item according to the term frequency weight and the document frequency; select the feature item as a feature word of the corresponding extended sample system information according to the feature weight; and extract features of each piece of extended sample system information according to the feature words.

In one embodiment, the system information includes system description information and a system file; the information archiving module 606 is further configured to generate an information node according to the system description information; detect whether an identical information node already exists in the target information tree obtained by screening; and if no identical information node exists, add the information node to the corresponding target information tree and associate the system file with the information node.
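The screening and duplicate check performed by the information archiving module can be sketched together. The dictionary keys, the function name, and the sample file name are hypothetical, and two information nodes are assumed to be "the same" when their descriptions are equal; the text does not define the comparison.

```python
def archive(description, system_file, target_category, target_trees):
    """Add one piece of system information to every matching target tree.

    target_trees: list of dicts with hypothetical keys "name", "labels", "nodes".
    Returns the names of the trees that received a new information node.
    """
    updated = []
    for tree in target_trees:
        # Screening: keep only trees whose category labels include the target.
        if target_category not in tree["labels"]:
            continue
        # Duplicate check: skip trees that already hold an identical node.
        if any(node["description"] == description for node in tree["nodes"]):
            continue
        tree["nodes"].append({"description": description, "file": system_file})
        updated.append(tree["name"])
    return updated


target_trees = [
    {"name": "administration", "labels": {"administration management"}, "nodes": []},
    {"name": "technical", "labels": {"technical standard"}, "nodes": []},
]
first = archive("attendance rules", "attendance.pdf",
                "administration management", target_trees)
second = archive("attendance rules", "attendance.pdf",
                 "administration management", target_trees)
```

The second call adds nothing, so republishing the same system information leaves the target information tree unchanged.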

In one embodiment, the system information comprises system description information and associated system files; the system file comprises a plurality of system clauses and applicable object identifications respectively corresponding to the system clauses; the associated information tree has corresponding applicable object identification; the information archiving module 606 is further configured to split the system file, and generate a system subfile corresponding to each applicable object identifier by using the system clause corresponding to each applicable object identifier; acquiring a plurality of associated information trees corresponding to a target information tree; and adding system description information and system subfiles to the corresponding associated information tree according to the applicable object identifier.

In one embodiment, the information archiving module 606 is further configured to calculate the data volume of the system file, and detect whether the data volume exceeds a threshold; when the data volume exceeds the threshold, acquire a preset target data volume, and determine the splitting positions of the system file according to the target data volume; detect whether each splitting position is located between adjacent separators; when a splitting position is located at a separator, split the system file into a plurality of intermediate files at that splitting position; when a splitting position is located between adjacent separators, split the system file into a plurality of intermediate files at either of the adjacent separators; and split the plurality of intermediate files according to a preset splitting rule.

For specific limitations of the system information processing apparatus, reference may be made to the limitations of the system information processing method above, which are not repeated here. Each module of the system information processing apparatus may be implemented wholly or partly in software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor of the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can call them and execute the corresponding operations.

In one embodiment, a computer device is provided, which may be a server, and its internal structure may be as shown in fig. 7. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device stores system information. The network interface of the computer device communicates with an external terminal through a network connection. The computer program, when executed by the processor, implements a system information processing method.

Those skilled in the art will appreciate that the architecture shown in fig. 7 is merely a block diagram of part of the structure related to the disclosed solution and does not limit the computer devices to which the solution applies; a particular computer device may include more or fewer components than shown, combine certain components, or arrange the components differently.

In one embodiment, there is provided a computer device comprising a memory storing a computer program and a processor implementing the following steps when the processor executes the computer program: monitoring system information issued by a terminal, and segmenting the system information to obtain a corresponding original word set; the original word set comprises a plurality of original words; synonymy expanding is carried out on each original word, and an expanded word set corresponding to each original word is generated; forming an expansion system information set corresponding to the system information according to each expansion word set; inputting the extended system information set into a preset system management model to obtain a target type corresponding to the system information; obtaining category labels corresponding to the target information trees respectively, screening the target information trees containing the category labels corresponding to the target categories, and adding system information to the screened target information trees.

In one embodiment, the system information includes system description information; the processor, when executing the computer program, further performs the steps of: detecting whether the system description information contains category information; if so, adding the system information to the corresponding target information tree according to the category information; otherwise, performing word segmentation on the system information to obtain the corresponding original word set.

In one embodiment, the processor, when executing the computer program, further performs the steps of: acquiring training sample data; the training sample data comprises a plurality of sample system information and respectively corresponding category labels; performing word segmentation and synonymy expansion processing on each sample system information to obtain an expanded sample system information set corresponding to each sample system information; and training an initial system management model through a support vector machine algorithm according to each extended sample system information set and the corresponding category labels to obtain the system management model.

In one embodiment, the extended sample system information set comprises a plurality of groups of extended sample system information; the processor, when executing the computer program, further performs the steps of: acquiring a feature item, and calculating the term frequency weight of the feature item in a group of extended sample system information; calculating the document frequency of the feature item in the whole training sample data; calculating the feature weight corresponding to the feature item according to the term frequency weight and the document frequency; selecting the feature item as a feature word of the corresponding extended sample system information according to the feature weight; and extracting features of each piece of extended sample system information according to the feature words.

In one embodiment, the system information includes system description information and a system file; the processor, when executing the computer program, further performs the steps of: generating an information node according to the system description information; detecting whether an identical information node already exists in the target information tree obtained by screening; and if no identical information node exists, adding the information node to the corresponding target information tree and associating the system file with the information node.

In one embodiment, the system information comprises system description information and associated system files; the system file comprises a plurality of system clauses and applicable object identifications respectively corresponding to the system clauses; the associated information tree has corresponding applicable object identification; the processor, when executing the computer program, further performs the steps of: splitting the system file, and generating a system subfile corresponding to each applicable object identifier by using the system clause corresponding to each applicable object identifier; acquiring a plurality of associated information trees corresponding to a target information tree; and adding system description information and system subfiles to the corresponding associated information tree according to the applicable object identifier.

In one embodiment, the processor, when executing the computer program, further performs the steps of: calculating the data volume of the system file, and detecting whether the data volume exceeds a threshold value; when the data volume exceeds a threshold value, acquiring a preset target data volume, and determining the splitting position of a system file according to the target data volume; detecting whether the splitting position is positioned between adjacent separators; when the splitting position is located at a separator, splitting the system file into a plurality of intermediate files at the splitting position; when the splitting position is positioned between the adjacent separators, splitting the system file into a plurality of intermediate files at any one of the adjacent separators; and splitting the plurality of intermediate files according to a preset splitting rule.

In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of: monitoring system information issued by a terminal, and segmenting the system information to obtain a corresponding original word set; the original word set comprises a plurality of original words; synonymy expanding is carried out on each original word, and an expanded word set corresponding to each original word is generated; forming an expansion system information set corresponding to the system information according to each expansion word set; inputting the extended system information set into a preset system management model to obtain a target type corresponding to the system information; obtaining category labels corresponding to the target information trees respectively, screening the target information trees containing the category labels corresponding to the target categories, and adding system information to the screened target information trees.

In one embodiment, the system information includes system description information; the computer program, when executed by the processor, further performs the steps of: detecting whether the system description information contains category information; if so, adding the system information to the corresponding target information tree according to the category information; otherwise, performing word segmentation on the system information to obtain the corresponding original word set.

In one embodiment, the computer program, when executed by the processor, further performs the steps of: acquiring training sample data, the training sample data comprising a plurality of pieces of sample system information and respectively corresponding category labels; performing word segmentation and synonym expansion on each piece of sample system information to obtain an extended sample system information set corresponding to each piece of sample system information; and training an initial system management model through a support vector machine algorithm according to each extended sample system information set and the corresponding category labels to obtain the system management model.

In one embodiment, the extended sample system information set comprises a plurality of groups of extended sample system information; the computer program, when executed by the processor, further implements the steps of: acquiring a feature item, and calculating the word frequency weight of the feature item in a group of extended sample system information; calculating the document frequency of the feature item in the whole training sample data; calculating the feature weight corresponding to the feature item according to the word frequency weight and the document frequency; selecting the feature item as a feature word of the corresponding extended sample system information according to the feature weight; and extracting the features of each piece of extended sample system information according to the feature words.
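The feature weighting described above combines a word frequency with a document frequency. As a minimal sketch, the conventional TF-IDF form is assumed here, with weight = (count / group length) × ln(N / document frequency); this embodiment does not state the exact formula, so the formula and the names `tfidf_weights` and `top_features` are illustrative only.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # document frequency: count each term once per doc
    weights = []
    for tokens in docs:
        tf = Counter(tokens)
        total = len(tokens)
        weights.append(
            {t: (c / total) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return weights


def top_features(weight, k):
    """Select the k highest-weighted terms as feature words (ties broken by name)."""
    ranked = sorted(weight.items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, _ in ranked[:k]]
```

Under this assumed formula a term that occurs in every group receives a weight of zero, so it is never preferred as a feature word.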

In one embodiment, the system information includes system description information and a system file; the computer program, when executed by the processor, further implements the steps of: generating an information node according to the system description information; detecting whether an identical information node already exists in the screened target information tree; and if no identical information node exists, adding the information node to the corresponding target information tree, and associating the system file with the information node.
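The duplicate check above may be sketched as follows, purely as an illustration. Two information nodes are assumed to be identical when their descriptions match after trimming whitespace and lowercasing; this embodiment does not define the comparison, and the classes `InfoNode` and `InfoTree` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class InfoNode:
    title: str  # derived from the system description information
    files: list = field(default_factory=list)


@dataclass
class InfoTree:
    category: str
    nodes: dict = field(default_factory=dict)  # normalized title -> InfoNode

    def add(self, description, system_file):
        """Add a node unless an identical one exists; return True if added."""
        key = description.strip().lower()
        if key in self.nodes:
            return False  # identical node already present: nothing is added
        self.nodes[key] = InfoNode(title=description.strip(), files=[system_file])
        return True
```

Keying the nodes by normalized title makes the existence check a single dictionary lookup.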

In one embodiment, the system information comprises system description information and an associated system file; the system file comprises a plurality of system clauses and applicable object identifiers respectively corresponding to the system clauses; each associated information tree has a corresponding applicable object identifier; the computer program, when executed by the processor, further implements the steps of: splitting the system file, and generating, from the system clauses corresponding to each applicable object identifier, a system subfile corresponding to that applicable object identifier; acquiring a plurality of associated information trees corresponding to the target information tree; and adding the system description information and the system subfiles to the corresponding associated information trees according to the applicable object identifiers.
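As an illustrative sketch of the clause-level splitting above, the system file is assumed to be already parsed into (applicable object identifier, clause text) pairs, and each associated information tree is modeled as a plain list keyed by its applicable object identifier. Both representations, and the function names, are assumptions made for the example.

```python
from collections import defaultdict


def split_by_object(clauses):
    """clauses: list of (applicable_object_id, clause_text) pairs.

    Returns {applicable_object_id: subfile_text}, one subfile per identifier.
    """
    grouped = defaultdict(list)
    for object_id, text in clauses:
        grouped[object_id].append(text)
    return {object_id: "\n".join(texts) for object_id, texts in grouped.items()}


def file_into_associated_trees(description, clauses, associated_trees):
    """associated_trees: {applicable_object_id: list of (description, subfile)}."""
    for object_id, subfile in split_by_object(clauses).items():
        if object_id in associated_trees:  # skip identifiers that have no tree
            associated_trees[object_id].append((description, subfile))
    return associated_trees
```

Each associated information tree thus receives only the clauses that apply to its own object, together with the shared description.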

In one embodiment, the computer program, when executed by the processor, further performs the steps of: calculating the data volume of the system file, and detecting whether the data volume exceeds a threshold value; when the data volume exceeds the threshold value, acquiring a preset target data volume, and determining a splitting position of the system file according to the target data volume; detecting whether the splitting position is located between adjacent separators; when the splitting position is located at a separator, splitting the system file into a plurality of intermediate files at the splitting position; when the splitting position is located between adjacent separators, splitting the system file into a plurality of intermediate files at either one of the adjacent separators; and splitting the plurality of intermediate files according to a preset splitting rule.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described; however, any combination of these technical features should be considered to fall within the scope of the present specification as long as the combination contains no contradiction.

The above examples express only several embodiments of the present application. Their description is relatively specific and detailed, but it should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and these all fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims

1. A system information processing method, the method comprising:

monitoring system information issued by a terminal, and performing word segmentation on the system information to obtain a corresponding original word set; the original word set comprises a plurality of original words; the system information comprises system description information and an associated system file; the system file comprises a plurality of system clauses and applicable object identifiers respectively corresponding to the system clauses;

performing synonym expansion on each original word to generate an expanded word set corresponding to each original word;

forming an extended system information set corresponding to the system information according to each extended word set;

inputting the extended system information set into a preset system management model to obtain a target category corresponding to the system information;

obtaining category labels respectively corresponding to a plurality of target information trees, screening out the target information trees containing the category label corresponding to the target category, and adding the system information to the screened target information trees;

the method further comprises the following steps:

splitting the system file, and generating, from the system clauses corresponding to each applicable object identifier, a system subfile corresponding to that applicable object identifier;

acquiring a plurality of associated information trees corresponding to the target information tree; each associated information tree has a corresponding applicable object identifier, and each different applicable object identifier in the target information tree has a corresponding associated information tree;

and generating information nodes according to the system description information, respectively associating a plurality of system subfiles obtained by splitting to the information nodes, and respectively adding a plurality of information nodes associated with different system subfiles to the associated information trees corresponding to the same applicable object identifier.

2. The method of claim 1, wherein the system information comprises system description information; before the word segmentation is performed on the system information to obtain the corresponding original word set, the method further comprises the following steps:

detecting whether the system description information contains category information or not;

if yes, adding the system information to a corresponding target information tree according to the category information;

otherwise, performing word segmentation on the system information to obtain a corresponding original word set.

3. The method of claim 1, wherein the step of generating the system management model comprises: acquiring training sample data; the training sample data comprises a plurality of pieces of sample system information and respectively corresponding category labels;

performing word segmentation and synonym expansion on each piece of sample system information to obtain an extended sample system information set corresponding to each piece of sample system information;

and training an initial system management model through a support vector machine algorithm according to each extended sample system information set and the corresponding category labels to obtain the system management model.

4. The method of claim 3, wherein the extended sample system information set comprises a plurality of groups of extended sample system information; the training of the initial system management model through a support vector machine algorithm according to each extended sample system information set and the corresponding category labels comprises the following steps:

acquiring a feature item, and calculating the word frequency weight of the feature item in a group of extended sample system information;

calculating the document frequency of the feature item in the whole training sample data;

calculating the feature weight corresponding to the feature item according to the word frequency weight and the document frequency;

selecting the feature item as a feature word of the corresponding extended sample system information according to the feature weight;

and extracting the features of each piece of extended sample system information according to the feature words.

5. The method of claim 1, wherein the system information comprises system description information and a system file; the adding of the system information to the screened target information trees comprises the following steps:

generating an information node according to the system description information;

detecting whether an identical information node already exists in the screened target information tree;

and if no identical information node exists, adding the information node to the corresponding target information tree, and associating the system file with the information node.

6. The method of claim 1, wherein the splitting of the system file comprises:

calculating the data volume of the system file, and detecting whether the data volume exceeds a threshold value;

when the data volume exceeds a threshold value, acquiring a preset target data volume, and determining the splitting position of the system file according to the target data volume;

detecting whether the splitting position is located between adjacent separators;

when the splitting position is located at a separator, splitting the system file into a plurality of intermediate files at the splitting position;

when the splitting position is located between adjacent separators, splitting the system file into a plurality of intermediate files at any one of the adjacent separators;

and splitting the plurality of intermediate files according to a preset splitting rule.

7. A system information processing apparatus, the apparatus comprising:

an information expansion module, configured to monitor system information issued by a terminal and perform word segmentation on the system information to obtain a corresponding original word set; the original word set comprises a plurality of original words; to perform synonym expansion on each original word to generate an expanded word set corresponding to each original word; and to form an extended system information set corresponding to the system information according to each expanded word set; the system information comprises system description information and an associated system file; the system file comprises a plurality of system clauses and applicable object identifiers respectively corresponding to the system clauses;

an information filing module, configured to obtain category labels respectively corresponding to a plurality of target information trees, screen out the target information trees containing the category label corresponding to the target category, and add the system information to the screened target information trees;

the information filing module is further configured to split the system file and generate, from the system clauses corresponding to each applicable object identifier, a system subfile corresponding to that applicable object identifier;

8. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 6 when executing the computer program.

9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 6.