CN115034476A - Project risk prediction method, device, equipment, medium and program product

Info

Publication number
CN115034476A
Authority
CN
China
Prior art keywords
text
historical production
product application
project
keyword
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210661280.5A
Other languages
Chinese (zh)
Inventor
贝飞
杨玉新
王诗章
安卫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202210661280.5A
Publication of CN115034476A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/194 Calculation of difference between files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G06Q10/103 Workflow collaboration or project management
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Marketing (AREA)
  • Development Economics (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Game Theory and Decision Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Educational Administration (AREA)
  • Data Mining & Analysis (AREA)
  • Probability & Statistics with Applications (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The disclosure provides a project risk prediction method applicable to the technical field of artificial intelligence. The method comprises: acquiring a requirement text whose risk is to be predicted in a project, and the product application related to the requirement text; calculating the similarity between the requirement text and a historical production problem set to obtain at least one historical production problem whose similarity with the requirement text is higher than a predetermined threshold, wherein the historical production problem set comprises the problem texts corresponding to the historical production problems; performing frequent item calculation based on the historical production problem set to obtain at least one risk product application associated with the product application; and outputting the at least one historical production problem and the at least one risk product application to indicate the project risk associated with the requirement text. The method can improve the efficiency of risk prediction and make the prediction more comprehensive. The present disclosure also provides a project risk prediction apparatus, device, storage medium and program product.

Description

Project risk prediction method, device, equipment, medium and program product
Technical Field
The present disclosure relates to the field of artificial intelligence, in particular to the field of risk prediction, and more particularly to a method, an apparatus, a device, a medium, and a program product for project risk prediction.
Background
In project management, project development is gradually shifting from the traditional mode to agile development in order to adapt quickly to business requirements and to the market's demand for rapid product launches. Each project consists of a plurality of requirement items, and each requirement item is developed and implemented jointly by one or more product applications. After a project goes live, if a problem occurs in production, the source of the problem is analyzed and the name of the product application that caused it is recorded. If the content currently being modified during project construction resembles content that caused earlier production problems, the possible production risks of the project can be verified with emphasis during testing, which greatly improves the operational stability of the project after it goes live.
For predicting project risk, the traditional approach relies on project developers reviewing the functions they have developed and raising potential problems as project risks for testing to focus on. Alternatively, it relies on test managers or experts with extensive working experience to anticipate possible production risks from the requirement description and to evaluate the potential risks after launch. Both approaches depend entirely on manual processing and place high demands on the experience of the people identifying risks. For current bank projects with a high iteration speed, human error easily occurs when a project manager's workload is excessive.
In summary, project management currently lacks a method that frees up manpower and predicts project risks automatically. How to predict the project risks that may arise on the basis of existing production problems is a technical problem to be solved in the field.
Disclosure of Invention
In view of the foregoing, the present disclosure provides project risk prediction methods, apparatuses, devices, media, and program products that improve risk prediction efficiency.
According to a first aspect of the present disclosure, there is provided a project risk prediction method, comprising: acquiring a requirement text whose risk is to be predicted in a project, and the product application related to the requirement text; calculating the similarity between the requirement text and a historical production problem set to obtain at least one historical production problem whose similarity with the requirement text is higher than a predetermined threshold, wherein the historical production problem set comprises the problem texts corresponding to the historical production problems; performing frequent item calculation based on the historical production problem set to obtain at least one risk product application associated with the product application; and outputting the at least one historical production problem and the at least one risk product application to indicate the project risk associated with the requirement text.
According to an embodiment of the present disclosure, calculating the similarity between the requirement text and the historical production problem set to obtain at least one historical production problem whose similarity with the requirement text is higher than a predetermined threshold comprises: extracting a plurality of first keywords from the requirement text, and extracting a plurality of second keywords from the problem text of each historical production problem; vectorizing the plurality of first keywords to obtain a first feature vector, and vectorizing the plurality of second keywords of each historical production problem to obtain a plurality of second feature vectors; and calculating the similarity between the first feature vector and each second feature vector to obtain the similarity between the requirement text and each problem text, so as to determine at least one historical production problem whose similarity with the requirement text is higher than the predetermined threshold.
According to an embodiment of the present disclosure, extracting a plurality of first keywords from the requirement text and extracting a plurality of second keywords from the problem text of each historical production problem comprises: performing word segmentation on the requirement text and on the problem text of each historical production problem; labeling parts of speech in the requirement text and in each problem text so as to filter out irrelevant words; matching the requirement text against a preset keyword set to obtain the plurality of first keywords of the requirement text; and matching each problem text against the keyword set to obtain the plurality of second keywords of each problem text.
According to an embodiment of the present disclosure, vectorizing the plurality of first keywords to obtain a first feature vector and vectorizing the plurality of second keywords of each historical production problem to obtain a plurality of second feature vectors comprises: obtaining a custom parameter value for each first keyword and each second keyword; calculating the weight value of each first keyword and each second keyword according to the TF-IDF algorithm and the custom parameter value; and forming the first feature vector from the weight values of the plurality of first keywords, and forming the plurality of second feature vectors from the weight values of the plurality of second keywords of each problem text.
According to an embodiment of the present disclosure, obtaining the custom parameter value for each first keyword and each second keyword comprises: determining whether the first keyword or the second keyword belongs to a preset professional vocabulary set; when the first keyword or the second keyword belongs to the professional vocabulary set, setting the custom parameter value to a specific value; and when the first keyword or the second keyword does not belong to the professional vocabulary set, setting the custom parameter value to 1.
According to an embodiment of the present disclosure, performing frequent item calculation based on the historical production problem set to obtain at least one risk product application associated with the product application comprises: obtaining the product applications related to each historical production problem from each problem text to obtain a product application set; performing frequent item calculation on the product application set to obtain a plurality of frequent item sets, wherein each frequent item set comprises a plurality of product applications in the product application set whose degree of association is higher than a preset threshold; and matching the product application related to the requirement text against the frequent item sets to obtain the at least one risk product application.
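The frequent item step described above can be illustrated with a minimal sketch. It rests on several assumptions that the source does not state: the degree of association is measured as support (the fraction of historical problems in which the applications appear together), only pairs of applications are considered, and all function names and application names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return {pair: support} for application pairs whose support exceeds min_support.

    Each transaction is the set of product applications named in one
    historical production problem (an assumed representation).
    """
    counts = Counter()
    for apps in transactions:
        # Sort so that each unordered pair has one canonical key.
        for pair in combinations(sorted(set(apps)), 2):
            counts[pair] += 1
    total = len(transactions)
    return {pair: c / total for pair, c in counts.items() if c / total > min_support}


def risky_applications(requirement_apps, pairs):
    """Applications that frequently co-occur with an application named in the requirement."""
    risky = set()
    for a, b in pairs:
        if a in requirement_apps and b not in requirement_apps:
            risky.add(b)
        elif b in requirement_apps and a not in requirement_apps:
            risky.add(a)
    return risky


# Hypothetical history: applications involved in four past production problems.
history = [
    {"payments", "ledger"},
    {"payments", "ledger", "notifications"},
    {"payments", "ledger"},
    {"notifications", "reports"},
]
pairs = frequent_pairs(history, min_support=0.5)
print(risky_applications({"payments"}, pairs))
```

A full implementation would more likely use an established frequent itemset algorithm such as Apriori or FP-Growth, which also handle item sets larger than pairs; the pair counting here is only meant to make the matching step concrete.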
According to an embodiment of the present disclosure, performing frequent item calculation based on the historical production problem set to obtain at least one risk product application associated with the product application further comprises: acquiring the business type of each historical production problem from each problem text; and dividing the historical production problem set into a plurality of business problem sets according to business type, so as to narrow the range of the product application set on which frequent item calculation is performed.
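As a rough illustration of this partitioning, the sketch below groups problem records by business type before any frequent item calculation is run. The record layout and the field name `business_type` are assumptions made for the example, not details given in the source.

```python
from collections import defaultdict


def partition_by_business_type(problems):
    """Group problem records (dicts) by their assumed 'business_type' field."""
    groups = defaultdict(list)
    for problem in problems:
        groups[problem["business_type"]].append(problem)
    return dict(groups)


# Hypothetical problem records.
problems = [
    {"id": "P1", "business_type": "transfer", "product_applications": ["payments", "ledger"]},
    {"id": "P2", "business_type": "loan", "product_applications": ["credit"]},
    {"id": "P3", "business_type": "transfer", "product_applications": ["payments"]},
]
by_type = partition_by_business_type(problems)
print(sorted(by_type))
```

Frequent item calculation would then run on the group that matches the business type of the requirement text, rather than on the whole history.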
According to an embodiment of the present disclosure, the method further comprises: standardizing the requirement text and each problem text respectively to obtain texts that conform to a preset rule.
A second aspect of the present disclosure provides a project risk prediction apparatus, comprising: an acquisition module for acquiring a requirement text whose risk is to be predicted in a project, and the product application related to the requirement text; a similarity calculation module for calculating the similarity between the requirement text and a historical production problem set to obtain at least one historical production problem whose similarity with the requirement text is higher than a predetermined threshold, wherein the historical production problem set comprises the problem texts corresponding to the historical production problems; a frequent item calculation module for performing frequent item calculation based on the historical production problem set to obtain at least one risk product application associated with the product application; and an output module for outputting the at least one historical production problem and the at least one risk product application to indicate the project risk associated with the requirement text.
A third aspect of the present disclosure provides an electronic device, comprising: one or more processors; a memory for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the above project risk prediction method.
A fourth aspect of the present disclosure also provides a computer-readable storage medium having stored thereon executable instructions that, when executed by a processor, cause the processor to perform the above project risk prediction method.
A fifth aspect of the present disclosure also provides a computer program product comprising a computer program which, when executed by a processor, implements the above project risk prediction method.
Drawings
The foregoing and other objects, features and advantages of the disclosure will be apparent from the following description of embodiments of the disclosure, which proceeds with reference to the accompanying drawings, in which:
FIG. 1 schematically illustrates an application scenario diagram of a project risk prediction method, apparatus, device, medium, and program product according to embodiments of the disclosure;
FIG. 2 schematically illustrates a flow chart of a project risk prediction method according to an embodiment of the present disclosure;
FIG. 3 schematically shows a detailed flowchart of S220 of the project risk prediction method according to an embodiment of the present disclosure;
FIG. 4 schematically shows a detailed flowchart of S230 of the project risk prediction method according to an embodiment of the present disclosure;
FIG. 5 schematically shows a block diagram of a project risk prediction apparatus according to an embodiment of the present disclosure; and
FIG. 6 schematically shows a block diagram of an electronic device adapted to implement a project risk prediction method according to an embodiment of the present disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is illustrative only and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs, unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.
Where a convention analogous to "at least one of A, B and C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B and C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B and C together, etc.).
An embodiment of the disclosure provides a project risk prediction method applied in the field of artificial intelligence, comprising: acquiring a requirement text whose risk is to be predicted in a project, and the product application related to the requirement text; calculating the similarity between the requirement text and a historical production problem set to obtain at least one historical production problem whose similarity with the requirement text is higher than a predetermined threshold, wherein the historical production problem set comprises the problem texts corresponding to the historical production problems; performing frequent item calculation based on the historical production problem set to obtain at least one risk product application associated with the product application; and outputting the at least one historical production problem and the at least one risk product application to indicate the project risk associated with the requirement text. Based on the existing historical production problem set, the method can comprehensively predict the potential production risks of the project to which the requirement belongs without relying entirely on manual judgment, which greatly improves the operational stability of the project after it goes live.
Fig. 1 schematically shows an application scenario diagram of a project risk prediction method according to an embodiment of the present disclosure.
As shown in fig. 1, the application scenario 100 according to this embodiment may include terminal devices 101, 102, 103, a network 104 and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
A user may use terminal devices 101, 102, 103 to interact with a server 105 over a network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have installed thereon various communication client applications, such as shopping-like applications, web browser applications, search-like applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only).
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 105 may be a server providing various services, such as a background management server (for example only) providing support for websites browsed by users using the terminal devices 101, 102, 103. The background management server may analyze and perform other processing on the received data such as the user request, and feed back a processing result (e.g., a webpage, information, or data obtained or generated according to the user request) to the terminal device.
It should be noted that the project risk prediction method provided by the embodiment of the present disclosure may be generally executed by the server 105. Accordingly, the project risk prediction apparatus provided by the embodiments of the present disclosure may be generally disposed in the server 105. The project risk prediction method provided by the embodiment of the present disclosure may also be executed by a server or a server cluster that is different from the server 105 and is capable of communicating with the terminal devices 101, 102, 103 and/or the server 105. Accordingly, the project risk prediction device provided by the embodiment of the present disclosure may also be disposed in a server or a server cluster different from the server 105 and capable of communicating with the terminal devices 101, 102, 103 and/or the server 105.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for an implementation.
The project risk prediction method of the disclosed embodiment will be described in detail below with reference to fig. 2 to 4 based on the scenario described in fig. 1.
FIG. 2 schematically shows a flow diagram of a project risk prediction method according to an embodiment of the disclosure.
As shown in fig. 2, the project risk prediction method of the embodiment includes operations S210 to S240, and the project risk prediction method may be performed by the server 105.
In operation S210, a requirement text whose risk is to be predicted in a project, and the product application to which the requirement text relates, are acquired.
In software project management and development, each project consists of a plurality of requirement items. Each requirement item includes a requirement text written by a project developer according to the project requirements and the requirement rules, and the requirement text describes content such as the project name, business type, business scenario, requirement description and related product applications.
In the embodiment of the disclosure, the requirement items whose risk is to be predicted are all valid requirement items, that is, requirement items whose project approval status is "countersignature passed". The requirement texts of the valid requirement items are obtained for risk prediction.
In operation S220, the similarity between the requirement text and a historical production problem set is calculated to obtain at least one historical production problem whose similarity with the requirement text is higher than a predetermined threshold, wherein the historical production problem set includes the problem texts corresponding to the historical production problems.
The historical production problem set is a set of historical production problems that were solved in past projects and for which root cause analysis was completed during the development of those projects. Each historical production problem has a corresponding problem text, which describes the problem classification, the problem description, the related product applications, the root cause analysis, the time at which the problem occurred, and so on.
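The fields listed for the requirement text and the problem text suggest simple record types. The following sketch models them as Python dataclasses; the class names, field names and sample values are hypothetical, and the source does not prescribe any particular data layout.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RequirementText:
    """One requirement item of a project (field names are illustrative)."""
    project_name: str
    business_type: str
    business_scenario: str
    description: str
    product_applications: List[str] = field(default_factory=list)


@dataclass
class ProblemText:
    """One historical production problem (field names are illustrative)."""
    classification: str
    description: str
    root_cause: str
    occurred_at: str  # kept as a plain string; the source gives no date format
    product_applications: List[str] = field(default_factory=list)


requirement = RequirementText(
    project_name="Mobile transfer upgrade",
    business_type="transfer",
    business_scenario="cross-bank transfer",
    description="Raise the daily transfer limit for verified customers",
    product_applications=["payments"],
)
problem = ProblemText(
    classification="functional defect",
    description="Transfer limit check rejected valid transfers",
    root_cause="Limit configuration not synchronized between applications",
    occurred_at="2021-11-03",
    product_applications=["payments", "ledger"],
)
print(requirement.product_applications, problem.product_applications)
```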
According to an embodiment of the present disclosure, the method further comprises: standardizing the requirement text and each problem text respectively to obtain texts that conform to a preset rule.
According to the preset rule, the attributes of the requirement text and of each problem text are parsed and structured to form text data in a standardized format. The standardized text data is output according to the rule and converted into a modeling data source that is convenient for modeling and identification. Standardizing the requirement text and each problem text facilitates the subsequent data calculation and improves the efficiency and accuracy of risk prediction.
Fig. 3 schematically shows a detailed flowchart of S220 of the project risk prediction method according to an embodiment of the present disclosure.
As shown in fig. 3, operation S220 of the project risk prediction method of this embodiment specifically includes the following operations S221 to S223.
In operation S221, a plurality of first keywords are extracted from the requirement text, and a plurality of second keywords are extracted from the problem text of each historical production problem.
According to an embodiment of the present disclosure, extracting a plurality of first keywords from the requirement text and extracting a plurality of second keywords from the problem text of each historical production problem includes: performing word segmentation on the requirement text and on the problem text of each historical production problem; labeling parts of speech in the requirement text and in each problem text so as to filter out irrelevant words; matching the requirement text against a preset keyword set to obtain the plurality of first keywords of the requirement text; and matching each problem text against the keyword set to obtain the plurality of second keywords of each problem text.
In the embodiment of the disclosure, basic Chinese word segmentation is performed on the modeling data source of the requirement text and of each problem text using a CRF (conditional random field) algorithm, with a value set configured for calculating the labeling probability between words. Parts of speech are labeled in support of the Chinese word segmentation; for example, parts of speech such as nouns and verbs can be labeled with the FudanNLP toolkit, and uninformative words such as special symbols, adjectives and auxiliary words are filtered out. This reduces the dimensionality of the word vector model built subsequently and improves the efficiency of project risk prediction. The requirement text and each problem text are then screened against the existing keyword set to determine the first keywords and the second keywords.
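The filtering and matching part of this step can be sketched as follows. The sketch assumes that segmentation and part-of-speech labeling have already been done by an external tool and that each token arrives as a (word, tag) pair; the tag names, the choice to keep only nouns and verbs, and the sample vocabulary are all assumptions made for illustration.

```python
# Assumption: "n" marks nouns and "v" marks verbs; other tags are discarded.
KEPT_POS_TAGS = {"n", "v"}


def extract_keywords(tagged_tokens, keyword_set):
    """Keep tokens with an allowed part of speech that also appear in the keyword set."""
    return [
        word
        for word, tag in tagged_tokens
        if tag in KEPT_POS_TAGS and word in keyword_set
    ]


# Hypothetical output of a segmenter and part-of-speech tagger.
tagged = [
    ("transfer", "n"),
    ("quickly", "d"),
    ("fails", "v"),
    ("the", "u"),
    ("limit", "n"),
]
keyword_set = {"transfer", "limit", "login"}
print(extract_keywords(tagged, keyword_set))
```

Filtering by part of speech before matching keeps the later feature vectors small, which is the dimension reduction the paragraph above refers to.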
In operation S222, the plurality of first keywords are vectorized to obtain a first feature vector, and the plurality of second keywords of each historical production problem are vectorized to obtain a plurality of second feature vectors.
According to an embodiment of the present disclosure, vectorizing the plurality of first keywords to obtain a first feature vector and vectorizing the plurality of second keywords of each historical production problem to obtain a plurality of second feature vectors includes: obtaining a custom parameter value for each first keyword and each second keyword; calculating the weight value of each first keyword and each second keyword according to the TF-IDF (term frequency-inverse document frequency) algorithm and the custom parameter value; and forming the first feature vector from the weight values of the plurality of first keywords, and forming the plurality of second feature vectors from the weight values of the plurality of second keywords of each problem text.
The term frequency TF is the proportion of each keyword among all keywords, and the inverse document frequency IDF reflects how often texts containing the keyword occur among all texts. For example, the first feature vector comprises the weight values of M first keywords. Let the i-th first keyword in the first feature vector be $t_i$, with $1 \le i \le M$; let its term frequency in the keyword set be $f_i$; let the number of problem texts containing this first keyword be $n_i$; and let the total number of problem texts be $N$. Then the inverse document frequency of the first keyword $t_i$ is $\mathrm{IDF}_i = \log(N / n_i)$.
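The two quantities defined above can be computed as in the following sketch. The natural logarithm is an assumption, since the source does not state the base of the logarithm, and the function names and sample keywords are hypothetical.

```python
import math
from collections import Counter


def term_frequencies(keywords):
    """f_i: share of each keyword among all keywords extracted from one text."""
    counts = Counter(keywords)
    total = len(keywords)
    return {term: count / total for term, count in counts.items()}


def inverse_document_frequencies(problem_keyword_lists):
    """IDF_i = log(N / n_i), with N problem texts and n_i texts containing keyword i."""
    n_texts = len(problem_keyword_lists)
    texts_containing = Counter()
    for keywords in problem_keyword_lists:
        texts_containing.update(set(keywords))  # count each text at most once
    return {term: math.log(n_texts / n) for term, n in texts_containing.items()}


# Hypothetical keywords of one requirement text and of four problem texts.
requirement_keywords = ["transfer", "transfer", "limit", "login"]
problem_keywords = [
    ["transfer", "limit"],
    ["transfer"],
    ["login"],
    ["transfer", "login"],
]
tf = term_frequencies(requirement_keywords)
idf = inverse_document_frequencies(problem_keywords)
print(tf["transfer"], round(idf["limit"], 3))
```

A keyword that appears in every problem text gets an IDF of zero under this definition, so it contributes nothing to the similarity.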
According to an embodiment of the present disclosure, obtaining the custom parameter value for each first keyword and each second keyword includes: determining whether the first keyword or the second keyword belongs to a preset professional vocabulary set; when the first keyword or the second keyword belongs to the professional vocabulary set, setting the custom parameter value to a specific value; and when the first keyword or the second keyword does not belong to the professional vocabulary set, setting the custom parameter value to 1.
For example, in the banking industry, let the custom parameter value be $k$ and determine whether each extracted keyword belongs to the bank's professional vocabulary: if it does, $k = 100$; otherwise, $k = 1$. The weight value of each keyword is calculated by combining the TF-IDF algorithm with the custom parameter value $k$. For example, let the weight value of the first keyword $t_i$ be $w_i$; then $w_i = k \cdot f_i \cdot \mathrm{IDF}_i$. The weight value $w_i'$ of each second keyword in a second feature vector is calculated in the same way.
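A minimal sketch of this weighting follows. It assumes that the term frequencies and IDF values are already available as dictionaries and that the requirement text and the problem texts share one ordered vocabulary, so that the resulting vectors are aligned position by position. That shared ordering, the function names and the sample values are assumptions, not details from the source.

```python
def keyword_weight(tf, idf, is_professional, professional_factor=100.0):
    """w_i = k * f_i * IDF_i, with k = professional_factor for professional terms, else 1."""
    k = professional_factor if is_professional else 1.0
    return k * tf * idf


def feature_vector(vocabulary, tf, idf, professional_terms):
    """Build a weight vector over a shared, ordered vocabulary (missing terms weigh 0)."""
    return [
        keyword_weight(tf.get(term, 0.0), idf.get(term, 0.0), term in professional_terms)
        for term in vocabulary
    ]


# Hypothetical inputs.
vocabulary = ["transfer", "limit", "login"]
tf = {"transfer": 0.5, "login": 0.25}
idf = {"transfer": 2.0, "limit": 1.0, "login": 4.0}
professional_terms = {"transfer"}
print(feature_vector(vocabulary, tf, idf, professional_terms))
```

The factor of 100 follows the banking example in the text; scaling professional terms this strongly means they dominate the similarity whenever they occur in both texts.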
In operation S223, the similarity between the first feature vector and each second feature vector is respectively calculated to obtain the similarity between the requirement text and each question text, so as to determine at least one historical production question whose similarity to the requirement text is higher than a predetermined threshold.
For example, let the first feature vector corresponding to a requirement text S_1 be S_1 = (w_1, w_2, ..., w_n), and let the second feature vector corresponding to a problem text T_1 be T_1 = (w_1', w_2', ..., w_n'). The similarity between the requirement text S_1 and the problem text T_1, denoted Sim(S_1, T_1), is calculated as the cosine of the angle between the two vectors:
$$\mathrm{Sim}(S_1, T_1) = \frac{\sum_{i=1}^{n} w_i \, w_i'}{\sqrt{\sum_{i=1}^{n} w_i^{2}} \; \sqrt{\sum_{i=1}^{n} (w_i')^{2}}}$$
Because the requirement text and the problem text are written according to individual subjective habits, a threshold is used to adjust the similarity criterion. Let the similarity threshold be p and compare Sim(S_1, T_1) with p: when Sim(S_1, T_1) is greater than p, the requirement item corresponding to the requirement text S_1 is considered to be associated with the problem text T_1; otherwise, by default, no association exists. In this way, all historical production problems whose similarity to the requirement text is higher than the predetermined threshold are obtained. Because the method works from the original production problem data, the existing historical production problem set can faithfully reflect the risks most relevant to the project, and the prediction no longer relies entirely on the manual judgment of experienced experts or test managers, which reduces manual dependence in the project risk prediction process.
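The cosine comparison and the threshold p can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the function names, the sample vectors, the threshold value 0.6, and the convention that an all-zero vector has similarity 0 are not taken from the disclosure.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: an all-zero vector is associated with nothing
    return dot / (norm_u * norm_v)


def associated_problems(requirement_vec, problem_vecs, p):
    """Return the ids of problem texts whose similarity to the requirement exceeds p."""
    return [
        problem_id
        for problem_id, vec in problem_vecs.items()
        if cosine_similarity(requirement_vec, vec) > p
    ]


# Hypothetical weight vectors over the same three keywords.
s1 = [1.0, 2.0, 0.0]
problem_vecs = {
    "T1": [2.0, 4.0, 0.0],  # same direction as s1
    "T2": [0.0, 0.0, 3.0],  # no keyword in common
    "T3": [1.0, 0.0, 0.0],  # partial overlap
}
matches = associated_problems(s1, problem_vecs, p=0.6)
```

Both vectors must be built over the same ordered keyword list, otherwise the element-wise products compare weights of different keywords.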
In operation S230, frequent item calculations are performed based on the historical production problem set, resulting in at least one risky product application associated with the product application.
Fig. 4 schematically shows a detailed flowchart of S230 of the project risk prediction method according to an embodiment of the present disclosure.
As shown in fig. 4, operation S230 of the project risk prediction method of this embodiment specifically includes the following operations S231 to S233.
In operation S231, the product application related to each historical production problem is obtained from the respective problem text, resulting in a product application set.
According to the embodiment of the present disclosure, performing frequent item calculation based on the historical production problem set, and obtaining at least one risky product application associated with the product application further includes: acquiring the service type of each historical production problem from each problem text; and dividing the historical production problem set into a plurality of service problem sets according to the service types so as to narrow the range of the product application set for performing frequent item calculation.
In the embodiment of the disclosure, the problem texts are classified according to the service types to obtain a plurality of service problem sets, and then the product applications related to the historical production problems are obtained from the problem texts in the service problem sets, so as to obtain the product application sets of the service problem sets.
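The partitioning by service type can be sketched as follows. This is a minimal Python illustration; the record layout with the keys `service_type` and `applications`, and the sample values, are hypothetical, since the disclosure does not specify how these fields are stored in a problem text.

```python
from collections import defaultdict


def application_sets_by_service_type(problems):
    """Group the product applications of each problem record by service type.

    Each record is assumed to be a dict with the hypothetical keys
    'service_type' and 'applications'.
    """
    grouped = defaultdict(list)
    for problem in problems:
        grouped[problem["service_type"]].append(frozenset(problem["applications"]))
    return dict(grouped)


# Hypothetical historical production problem records.
problems = [
    {"service_type": "payments", "applications": ["F-a", "F-c"]},
    {"service_type": "payments", "applications": ["F-b", "F-c", "F-e"]},
    {"service_type": "loans", "applications": ["F-b", "F-e"]},
]
application_sets = application_sets_by_service_type(problems)
```

Each value in the result is the list of application sets for one service problem set, which is the input on which frequent item calculation is then run separately.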
In operation S232, frequent item calculation is performed on the product application set to obtain a plurality of frequent item sets, where each frequent item set includes a plurality of product applications in the product application set, and the association degree of each product application is higher than a predetermined threshold.
In the following, frequent item calculation is performed on the product application set of each service problem set.
Based on the FP-growth (Frequent Pattern growth) algorithm, a support parameter and a confidence parameter are set to control, respectively, the number of frequent items that the current application set can generate and the accuracy of the association rules. The support parameter is set to 0.5, that is, the proportion of records in the data set that contain the item set must be greater than or equal to 0.5. The confidence parameter is set to 0.75, that is, for a rule derived from an item set, the ratio of the number of records containing the whole item set to the number of records containing the rule's antecedent must be greater than or equal to 0.75.
The frequent item sets in the current data set, that is, the frequent item sets of the product application set of each service problem set, are then calculated. If an item set is infrequent, all of its supersets are also infrequent. Therefore, starting from k = 1 and continuing while the list of item sets is non-empty, the loop constructs the list of item sets consisting of k items, keeps the item sets whose support is greater than or equal to the support threshold, and generates from them the list of item sets consisting of k+1 items; the loop stops when no item sets remain. The frequent item sets of product applications for the current data set are then output. Table one is an example of frequent item sets of associated applications, showing the results generated with support thresholds of 0.5 and 0.7.
Next, for each frequent item set obtained, the proportion of records containing the whole item set among the records containing each of its items is calculated and compared with the confidence parameter; if it is greater than or equal to the confidence parameter, the rule is taken as an association rule that meets the condition. Table two is an example of frequent item sets under the association rules, showing the rules generated with confidence thresholds of 0.7 and 0.6. For example, the rule {F-a} -> {F-c} means that when application F-a is involved, application F-c is likely to be involved as well; hence, when a requirement item involves application F-a, it may also carry a risk involving application F-c.
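The level-wise loop described in this paragraph can be sketched as follows. This is a minimal Python illustration of that loop (an Apriori-style search), not an FP-tree implementation of FP-growth. The function name and the four sample records are assumptions made for illustration; the records were chosen so that the output coincides with the item sets listed in Table one.

```python
from itertools import combinations


def frequent_itemsets(records, min_support):
    """Return {itemset: support} for every item set with support >= min_support."""
    records = [frozenset(r) for r in records]
    n = len(records)

    def support(itemset):
        # Proportion of records that contain every item of the item set.
        return sum(1 for r in records if itemset <= r) / n

    items = sorted({item for r in records for item in r})
    current = [frozenset([item]) for item in items]  # item sets of size k = 1
    result = {}
    k = 1
    while current:
        kept = {c: s for c in current if (s := support(c)) >= min_support}
        result.update(kept)
        # Build (k+1)-item candidates; skip any with an infrequent k-item subset,
        # because every superset of an infrequent item set is infrequent.
        candidates = set()
        for a, b in combinations(kept, 2):
            union = a | b
            if len(union) == k + 1 and all(
                frozenset(sub) in kept for sub in combinations(union, k)
            ):
                candidates.add(union)
        current = list(candidates)
        k += 1
    return result


# Hypothetical records: the product applications involved in four production problems.
records = [
    {"F-a", "F-c", "F-d"},
    {"F-b", "F-c", "F-e"},
    {"F-a", "F-b", "F-c", "F-e"},
    {"F-b", "F-e"},
]
frequent_05 = frequent_itemsets(records, 0.5)
frequent_07 = frequent_itemsets(records, 0.7)
```

Raising the support threshold from 0.5 to 0.7 shrinks the result from nine item sets to four, which shows how the support parameter controls the number of frequent items generated.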
Table one:
Support | Frequent item sets
0.5 | {F-b,F-c,F-e}, {F-a,F-c}, {F-b,F-e}, {F-b,F-c}, {F-c,F-e}, {F-a}, {F-b}, {F-c}, {F-e}
0.7 | {F-b,F-e}, {F-b}, {F-c}, {F-e}
Table two:
(Table two is provided as image BDA0003690443850000111 in the original publication and is not reproduced here.)
In operation S233, the product application related to the requirement text is matched with the frequent item sets to obtain at least one risky product application.
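The confidence check on rules derived from frequent item sets can be sketched as follows. This is an illustrative Python sketch that uses the standard definition confidence(X -> Y) = support(X ∪ Y) / support(X). The function name and the support values are assumptions; the values are hypothetical figures for the item sets listed in Table one at support 0.5, and the sketch does not claim to reproduce the contents of Table two.

```python
from itertools import combinations


def association_rules(supports, min_confidence):
    """Return (antecedent, consequent, confidence) for rules meeting min_confidence.

    `supports` maps each frequent item set to its support and must contain
    every non-empty subset of each item set it lists.
    """
    rules = []
    for itemset, itemset_support in supports.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty antecedent and consequent
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                antecedent = frozenset(antecedent)
                confidence = itemset_support / supports[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules


fs = frozenset
# Hypothetical supports for the frequent item sets of Table one (support 0.5).
supports = {
    fs({"F-a"}): 0.5, fs({"F-b"}): 0.75, fs({"F-c"}): 0.75, fs({"F-e"}): 0.75,
    fs({"F-a", "F-c"}): 0.5, fs({"F-b", "F-c"}): 0.5,
    fs({"F-b", "F-e"}): 0.75, fs({"F-c", "F-e"}): 0.5,
    fs({"F-b", "F-c", "F-e"}): 0.5,
}
rules_07 = association_rules(supports, 0.7)
```

Under these assumed supports, {F-a} -> {F-c} passes the 0.7 threshold while the reverse rule {F-c} -> {F-a} does not, because F-c also occurs in records that do not involve F-a.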
By matching the product applications related to the requirement text against the frequent item sets obtained above, the risky product applications associated with those product applications are obtained. Associated application sets are mined automatically, and setting thresholds on the frequent item rules controls the accuracy of the association rules. This reduces the risk of overlooking associated applications during project risk prediction, improves the comprehensiveness and foresight of the prediction, and saves the time otherwise spent communicating with each application development party.
In operation S240, the at least one historical production problem and the at least one risky product application are output to indicate the project risk associated with the requirement text.
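The matching step in operation S233 can be sketched as follows. This is a minimal Python illustration; the function name and the rule that a risky application is any application sharing a multi-item frequent item set with an application of the requirement are assumptions about how the matching is carried out.

```python
def risky_applications(requirement_apps, frequent_itemsets):
    """Return applications that co-occur with the requirement's applications."""
    requirement_apps = set(requirement_apps)
    risky = set()
    for itemset in frequent_itemsets:
        # Single-item sets carry no association; skip sets unrelated to the requirement.
        if len(itemset) > 1 and requirement_apps & set(itemset):
            risky |= set(itemset) - requirement_apps
    return risky


# Frequent item sets taken from the support 0.7 row and one set from the 0.5 row of Table one.
itemsets = [
    frozenset({"F-a", "F-c"}),
    frozenset({"F-b", "F-e"}),
    frozenset({"F-b"}),
]
risky_for_a = risky_applications({"F-a"}, itemsets)
risky_for_b = risky_applications({"F-b"}, itemsets)
```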
Table three schematically shows the project risk prediction results.
In addition, the project risk prediction result can be adjusted manually. By manually entering association rules, one or more production problems known to be related can be associated with the requirement text whose risk is to be predicted, which reduces computation cost; production problems can also be manually adjusted or re-associated, so that the final project risk prediction result better matches reality.
Table three:
(Table three is provided as image BDA0003690443850000121 in the original publication and is not reproduced here.)
The project risk prediction method provided by the present disclosure uses the existing historical production problem set to faithfully reflect the risks most relevant to the project, so that prediction does not rely entirely on manual judgment, which frees up manpower and improves risk prediction efficiency. Through frequent item calculation, the method automatically mines associated application sets and sets thresholds on the frequent item rules to control the accuracy of the association rules, which reduces the risk of overlooking associated applications during project risk identification and improves the comprehensiveness of the risk prediction.
Based on the project risk prediction method, the disclosure also provides a project risk prediction device. The apparatus will be described in detail below with reference to fig. 5.
Fig. 5 schematically shows a block diagram of a project risk prediction apparatus according to an embodiment of the present disclosure.
As shown in fig. 5, the project risk prediction apparatus 500 of this embodiment includes an acquisition module 510, a similarity calculation module 520, a frequent item calculation module 530, and an output module 540.
The acquisition module 510 is configured to acquire a requirement text of a risk to be predicted in a project and a product application to which the requirement text relates. In an embodiment, the acquisition module 510 may be configured to perform operation S210 described above, which is not described again here.
The similarity calculation module 520 is configured to calculate the similarity between the requirement text and a historical production problem set, and obtain at least one historical production problem whose similarity to the requirement text is higher than a predetermined threshold, where the historical production problem set includes problem texts corresponding to the historical production problems. In an embodiment, the similarity calculation module 520 may be configured to perform operation S220 described above, which is not described again here.
The frequent item calculation module 530 is configured to perform frequent item calculation based on the historical production problem set, and obtain at least one risky product application associated with the product application. In an embodiment, the frequent item calculation module 530 may be configured to perform operation S230 described above, which is not described again here.
The output module 540 is configured to output the at least one historical production problem and the at least one risky product application to indicate the project risk associated with the requirement text. In an embodiment, the output module 540 may be configured to perform operation S240 described above, which is not described again here.
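One way to picture how the four modules cooperate is the following Python sketch. It is only an illustration of the data flow from S210 to S240: the class name `ProjectRiskPredictor`, the callable signatures, and the stub functions are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple


@dataclass
class ProjectRiskPredictor:
    """Wires together four callables that stand in for modules 510 to 540."""

    acquire: Callable[[str], Tuple[str, Set[str]]]            # module 510 / S210
    similar_problems: Callable[[str], List[str]]              # module 520 / S220
    risky_applications: Callable[[Set[str]], Set[str]]        # module 530 / S230
    output: Callable[[List[str], Set[str]], Dict[str, list]]  # module 540 / S240

    def predict(self, requirement_id: str) -> Dict[str, list]:
        text, apps = self.acquire(requirement_id)
        problems = self.similar_problems(text)
        risky = self.risky_applications(apps)
        return self.output(problems, risky)


# Stub modules with hard-coded, hypothetical behaviour.
predictor = ProjectRiskPredictor(
    acquire=lambda rid: ("raise the transfer limit", {"F-a"}),
    similar_problems=lambda text: ["P-001"] if "transfer" in text else [],
    risky_applications=lambda apps: {"F-c"} if "F-a" in apps else set(),
    output=lambda problems, risky: {"problems": problems, "risky_apps": sorted(risky)},
)
report = predictor.predict("REQ-1")
```

Because each module is an independent callable, the similarity step and the frequent item step can be replaced or merged without changing the others.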
According to an embodiment of the present disclosure, any number of the acquisition module 510, the similarity calculation module 520, the frequent item calculation module 530, and the output module 540 may be combined and implemented in one module, or any one of the modules may be split into a plurality of modules. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of the other modules and implemented in one module. According to an embodiment of the present disclosure, at least one of the acquisition module 510, the similarity calculation module 520, the frequent item calculation module 530, and the output module 540 may be implemented at least partially as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, or an Application Specific Integrated Circuit (ASIC), or may be implemented by hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or by any one of software, hardware, and firmware, or by a suitable combination of any of them. Alternatively, at least one of the acquisition module 510, the similarity calculation module 520, the frequent item calculation module 530, and the output module 540 may be at least partially implemented as a computer program module which, when executed, performs a corresponding function.
FIG. 6 schematically shows a block diagram of an electronic device adapted to implement a project risk prediction method according to an embodiment of the present disclosure.
As shown in fig. 6, an electronic device 600 according to an embodiment of the present disclosure includes a processor 601 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. Processor 601 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or associated chipset, and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), among others. The processor 601 may also include onboard memory for caching purposes. Processor 601 may include a single processing unit or multiple processing units for performing different actions of a method flow according to embodiments of the disclosure.
In the RAM 603, various programs and data necessary for the operation of the electronic device 600 are stored. The processor 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. The processor 601 performs various operations of the method flows according to the embodiments of the present disclosure by executing programs in the ROM 602 and/or the RAM 603. It is to be noted that the programs may also be stored in one or more memories other than the ROM 602 and the RAM 603. The processor 601 may also perform various operations of the method flows according to embodiments of the present disclosure by executing programs stored in the one or more memories.
According to an embodiment of the present disclosure, the electronic device 600 may also include an input/output (I/O) interface 605, which is also connected to the bus 604. The electronic device 600 may also include one or more of the following components connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 610 as necessary, so that a computer program read from it can be installed into the storage section 608 as necessary.
The present disclosure also provides a computer-readable storage medium, which may be embodied in the device/apparatus/system described in the above embodiments; or may exist separately and not be assembled into the device/apparatus/system. The computer-readable storage medium carries one or more programs which, when executed, implement a project risk prediction method according to an embodiment of the present disclosure.
According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example but not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the present disclosure, a computer-readable storage medium may include the ROM 602 and/or the RAM 603 described above and/or one or more memories other than the ROM 602 and the RAM 603.
Embodiments of the present disclosure also include a computer program product comprising a computer program containing program code for performing the method illustrated in the flow chart. When the computer program product runs in a computer system, the program code is used for causing the computer system to realize the project risk prediction method provided by the embodiment of the disclosure.
The computer program performs the above-described functions defined in the system/apparatus of the embodiments of the present disclosure when executed by the processor 601. The systems, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the present disclosure.
In one embodiment, the computer program may be hosted on a tangible storage medium such as an optical storage device, a magnetic storage device, or the like. In another embodiment, the computer program may also be transmitted, distributed in the form of a signal on a network medium, downloaded and installed through the communication section 609, and/or installed from the removable medium 611. The computer program containing program code may be transmitted using any suitable network medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. The computer program, when executed by the processor 601, performs the above-described functions defined in the system of the embodiments of the present disclosure. The systems, devices, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the present disclosure.
In accordance with embodiments of the present disclosure, program code for executing computer programs provided by embodiments of the present disclosure may be written in any combination of one or more programming languages, and in particular, these computer programs may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. The programming languages include, but are not limited to, Java, C++, Python, C, and the like. The program code may execute entirely on the user computing device, partly on the user device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that the features recited in the various embodiments and/or claims of the present disclosure can be combined and/or incorporated in various ways, even if such combinations or incorporations are not expressly recited in the present disclosure. In particular, the features recited in the various embodiments and/or claims of the present disclosure may be combined and/or incorporated in various ways without departing from the spirit or teaching of the present disclosure. All such combinations and/or incorporations are within the scope of the present disclosure.
The embodiments of the present disclosure have been described above. However, these examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described separately above, this does not mean that the measures in the embodiments cannot be used in advantageous combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be devised by those skilled in the art without departing from the scope of the present disclosure, and such alternatives and modifications are intended to be within the scope of the present disclosure.

Claims (12)

1. A method of project risk prediction, comprising:
acquiring a requirement text of a risk to be predicted in a project and a product application related to the requirement text;
calculating the similarity between the requirement text and a historical production problem set to obtain at least one historical production problem whose similarity to the requirement text is higher than a predetermined threshold, wherein the historical production problem set comprises problem texts corresponding to the respective historical production problems;
performing frequent item calculation based on the historical production problem set to obtain at least one risky product application associated with the product application; and
outputting the at least one historical production problem and the at least one risky product application to indicate the project risk associated with the requirement text.
2. The project risk prediction method of claim 1, wherein calculating the similarity between the requirement text and the historical production problem set to obtain at least one historical production problem whose similarity to the requirement text is higher than a predetermined threshold comprises:
extracting a plurality of first keywords of the requirement text, and extracting a plurality of second keywords of the problem text of each historical production problem;
vectorizing the plurality of first keywords to obtain a first feature vector, and vectorizing the plurality of second keywords of each historical production problem respectively to obtain a plurality of second feature vectors; and
calculating the similarity between the first feature vector and each second feature vector respectively to obtain the similarity between the requirement text and each problem text, so as to determine at least one historical production problem whose similarity to the requirement text is higher than the predetermined threshold.
3. The project risk prediction method of claim 2, wherein extracting a plurality of first keywords of the requirement text and a plurality of second keywords of the problem text of each historical production problem comprises:
performing word segmentation on the requirement text and on the problem text of each historical production problem respectively;
performing part-of-speech tagging on the requirement text and each problem text to filter out irrelevant words;
matching the requirement text with a preset keyword set to obtain the plurality of first keywords of the requirement text; and
matching each problem text with the keyword set to obtain the plurality of second keywords of each problem text.
4. The project risk prediction method of claim 2, wherein vectorizing the plurality of first keywords to obtain a first feature vector and vectorizing the plurality of second keywords of each historical production problem respectively to obtain a plurality of second feature vectors comprises:
obtaining a custom parameter value of each first keyword and each second keyword;
calculating the weight value of each first keyword and each second keyword according to the TF-IDF algorithm and the custom parameter value; and
forming the first feature vector from the weight values of the first keywords, and forming the plurality of second feature vectors from the weight values of the second keywords of each problem text.
5. The project risk prediction method of claim 4, wherein obtaining the custom parameter value of each first keyword and each second keyword comprises:
determining whether the first keyword or the second keyword belongs to a preset professional vocabulary set;
when the first keyword or the second keyword belongs to the professional vocabulary set, setting the custom parameter value to a specific value; and
when the first keyword or the second keyword does not belong to the professional vocabulary set, setting the custom parameter value to 1.
6. The project risk prediction method of claim 1, wherein performing frequent item calculation based on the historical production problem set to obtain at least one risky product application associated with the product application comprises:
obtaining the product applications related to each historical production problem from each problem text to obtain a product application set;
performing frequent item calculation on the product application set to obtain a plurality of frequent item sets, wherein each frequent item set comprises a plurality of product applications in the product application set whose association degree is higher than a predetermined threshold; and
matching the product application related to the requirement text with the frequent item sets to obtain the at least one risky product application.
7. The project risk prediction method of claim 6, wherein performing frequent item calculation based on the historical production problem set to obtain at least one risky product application associated with the product application further comprises:
acquiring the service type of each historical production problem from each problem text; and
dividing the historical production problem set into a plurality of service problem sets according to the service types, so as to narrow the range of the product application set on which frequent item calculation is performed.
8. The project risk prediction method of claim 1, further comprising:
performing standardization processing on the requirement text and each problem text respectively to obtain texts that conform to a preset rule.
9. A project risk prediction apparatus comprising:
an acquisition module for acquiring a requirement text of a risk to be predicted in a project and a product application related to the requirement text;
a similarity calculation module for calculating the similarity between the requirement text and a historical production problem set to obtain at least one historical production problem whose similarity to the requirement text is higher than a predetermined threshold, wherein the historical production problem set comprises problem texts corresponding to the respective historical production problems;
a frequent item calculation module for performing frequent item calculation based on the historical production problem set to obtain at least one risky product application associated with the product application; and
an output module for outputting the at least one historical production problem and the at least one risky product application to indicate the project risk associated with the requirement text.
10. An electronic device, comprising:
one or more processors;
a storage device to store one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1-8.
11. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method of any one of claims 1 to 8.
12. A computer program product comprising a computer program which, when executed by a processor, carries out the method according to any one of claims 1 to 8.
CN202210661280.5A 2022-06-13 2022-06-13 Project risk prediction method, device, equipment, medium and program product Pending CN115034476A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210661280.5A CN115034476A (en) 2022-06-13 2022-06-13 Project risk prediction method, device, equipment, medium and program product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210661280.5A CN115034476A (en) 2022-06-13 2022-06-13 Project risk prediction method, device, equipment, medium and program product

Publications (1)

Publication Number Publication Date
CN115034476A true CN115034476A (en) 2022-09-09

Family

ID=83124738

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210661280.5A Pending CN115034476A (en) 2022-06-13 2022-06-13 Project risk prediction method, device, equipment, medium and program product

Country Status (1)

Country Link
CN (1) CN115034476A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination