WO2020253353A1 - Method for generating a preset user's resource acquisition qualification, and related device - Google Patents
- Publication number
- WO2020253353A1 (application PCT/CN2020/085847)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- reference text
- word
- target
- current type
- weight value
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/3332—Query translation
- G06F16/3334—Selection or weighting of terms from queries, including natural language queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3346—Query execution using probabilistic model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/03—Credit; Loans; Processing thereof
Definitions
- This application relates to the field of big data analysis, and in particular to a method for generating a preset user's resource acquisition qualification and related equipment.
- Traditionally, a lending institution reviews the loan materials provided by a borrower to produce a pre-loan survey report, which is used to determine whether the borrower qualifies for a loan. However, the inventor found that the traditional pre-loan survey report requires manual review, which is time-consuming and labor-intensive.
- Moreover, the materials provided by the borrower are too subjective to reflect the borrower's actual overall operating status. For example, a merchant that runs into problems may use unconventional means to "disguise" its current "official" corporate information, so lending institutions review merchant loan qualifications with low efficiency and accuracy.
- A method, apparatus, device, and computer storage medium for generating a resource acquisition qualification of a preset user are provided.
- A method for generating a preset user's resource acquisition qualification includes the following steps:
- traversing the reference text sets of different types respectively, performing semantic analysis on each reference text in the currently traversed reference text set, and obtaining, according to the semantic analysis result, the public opinion index of the enterprise information type corresponding to that set;
- generating the current qualification of the preset user according to the public opinion indexes of the different enterprise information types.
- An apparatus for generating a resource acquisition qualification of a preset user comprises:
- a query module, used to query the official resource acquisition qualification of the preset user;
- an obtaining module, used to obtain, when the official resource acquisition qualification is in a normal state, reference texts related to the preset user's different enterprise information types from network information sources, so as to obtain reference text sets of different types;
- a semantic analysis module, used to traverse the reference text sets of different types respectively, perform semantic analysis on each reference text in the currently traversed reference text set, and obtain, according to the semantic analysis result, the public opinion index of the enterprise information type corresponding to that set;
- a generating module, used to generate the current qualification of the preset user according to the public opinion indexes of the different enterprise information types after the reference text sets of different types have been traversed.
- A device for generating a resource acquisition qualification of a preset user comprises a memory, a processor, and a program for generating the preset user's resource acquisition qualification that is stored on the memory and runnable on the processor, the program being configured to implement the steps of the following method:
- traversing the reference text sets of different types respectively, performing semantic analysis on each reference text in the currently traversed reference text set, and obtaining, according to the semantic analysis result, the public opinion index of the enterprise information type corresponding to that set;
- generating the current qualification of the preset user according to the public opinion indexes of the different enterprise information types.
- A computer storage medium stores a program for generating a preset user's resource acquisition qualification, the program being configured to implement the steps of the following method:
- traversing the reference text sets of different types respectively, performing semantic analysis on each reference text in the currently traversed reference text set, and obtaining, according to the semantic analysis result, the public opinion index of the enterprise information type corresponding to that set;
- generating the current qualification of the preset user according to the public opinion indexes of the different enterprise information types.
- FIG. 1 is a schematic structural diagram of a device for generating a preset user's resource acquisition qualification in the hardware operating environment involved in the solutions of the embodiments of this application;
- FIG. 2 is a schematic flowchart of a first embodiment of the method for generating a resource acquisition qualification of a preset user according to this application;
- FIG. 3 is a schematic flowchart of a second embodiment of the method for generating a resource acquisition qualification of a preset user according to this application;
- FIG. 4 is a schematic flowchart of a third embodiment of the method for generating a resource acquisition qualification of a preset user according to this application;
- FIG. 5 is a structural block diagram of an apparatus for generating a resource acquisition qualification of a preset user according to this application.
- FIG. 1 shows the structure of the device for generating a preset user's resource acquisition qualification in the hardware operating environment involved in the solutions of the embodiments of this application.
- The device may include a processor 1001 (such as a CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005.
- The communication bus 1002 is used to implement connection and communication between these components.
- The user interface 1003 may include a display screen and an input unit such as a keyboard; optionally, the user interface 1003 may also include a standard wired interface and a wireless interface.
- The network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface).
- The memory 1005 may be a high-speed RAM or a non-volatile memory, such as a magnetic disk memory.
- The memory 1005 may also be a storage device independent of the foregoing processor 1001.
- The device for generating the resource acquisition qualification of the preset user may be a computer host or a smart phone used by a staff member of a lending institution.
- The memory 1005, which is a computer storage medium, may include an operating system, a network communication module, a user receiving module, and a program for generating the preset user's resource acquisition qualification.
- Through the processor 1001, the device calls the program for generating the preset user's resource acquisition qualification stored in the memory 1005 and executes the steps of the method for generating the preset user's resource acquisition qualification.
- FIG. 2 is a schematic flowchart of a first embodiment of a method for generating a resource acquisition qualification of a preset user according to the present application.
- the method for generating the resource acquisition qualification of the preset user includes the following steps:
- Step S10: query the official resource acquisition qualification of the preset user.
- The execution subject of this embodiment is the above-mentioned device for generating a preset user's resource acquisition qualification, on which the program for generating the preset user's resource acquisition qualification is loaded.
- The loan qualification of a merchant is used as the resource acquisition qualification.
- The program can be understood as a kind of client. In the first way of querying the official resource acquisition qualification of the preset user, the server corresponding to the client connects to a target database so that all preset user information is kept synchronized with the State Administration for Industry and Commerce.
- The client can also connect directly to the corresponding target database.
- The target database may be a database under the system of the State Administration for Industry and Commerce.
- In the second way of querying the official resource acquisition qualification of the preset user, the device receives the resource acquisition qualification data uploaded by the preset user and stores it in the target database.
- This embodiment is described using the first query method as an example: when the device detects that it is in communication with the target database, it queries the official loan qualification of the merchant from the target database.
- Through the client, the staff of the lending institution can directly query the corporate information that the borrowing merchant has registered with the State Administration for Industry and Commerce, as well as its official loan qualification status (for example, whether the corporate legal person or the enterprise itself has violated the law, such as criminal cases, administrative cases, or records of untrustworthiness).
- Enterprise information is divided into at least three types: basic enterprise information (type A), judicial information (type B), and business information (type C).
- the basic information of an enterprise includes enterprise industrial and commercial information.
- the enterprise industrial and commercial information includes the establishment time, operating period, operating status, registered capital, and main business of the enterprise.
- Enterprise judicial information includes enterprise business license information, legal person's employment information in the enterprise, senior management information, major change information, enterprise judicial litigation information, enterprise operation information, enterprise foreign investment relations and concentration information in the industry where the investment enterprise is located.
- Enterprise operation information includes enterprise business license information, legal person's employment information in the enterprise, senior management information, information on major changes, and the company's foreign investment relationship.
- Step S20: when the official resource acquisition qualification is in a normal state, obtain reference texts related to the preset user's different enterprise information types from network information sources, to obtain reference text sets of different types.
- Even if the device finds in the target database that the merchant's official loan qualification is normal, this does not mean that the merchant's loan qualification is necessarily sound.
- When the official loan qualification is normal, crawler technology is used to obtain, from network information sources, reference texts related to the merchant's different types of corporate information. The resulting reference text sets of different types are used to further verify the merchant's loan qualification.
- The network information source in this embodiment may be a web page, a forum, Weibo, WeChat, and so on.
- The web crawling tools can belong to different search engines, professional forum websites, Weibo websites, WeChat official accounts, and so on. In this embodiment, Python scripts can be used as crawling tools to obtain the reference texts related to each enterprise information type from network information sources more conveniently and quickly.
- third type B corporate judicial information
- Step S30: traverse the reference text sets of different types respectively, perform semantic analysis on each reference text in the currently traversed reference text set, and obtain, according to the semantic analysis result, the public opinion index of the enterprise information type corresponding to that set.
- The first-type text set A, the second-type text set B, and the third-type text set C are traversed separately, and the reference texts in each set are semantically analyzed. For example, the traversal may reach a certain type.
- The keywords in each reference text are extracted and analyzed, so that a few representative words indicate the content of the text. This greatly reduces the size of the text without significantly losing the information that conveys its tendency.
- Step S40: after the reference text sets of different types have been traversed, generate the current qualification of the preset user according to the public opinion indexes of the different enterprise information types.
- Public opinion indexes for at least three enterprise information types are obtained: the public opinion index of basic enterprise information (corresponding to set a), the public opinion index of enterprise judicial information (corresponding to set b), and the public opinion index of enterprise business information (corresponding to set c).
- This solution can sum these three public opinion indexes and compare the sum with a preset public opinion index. If the sum is greater than the preset public opinion index, the borrower is deemed to have credible borrowing strength; otherwise, the lending institution does not lend to the borrower. The preset public opinion index can be set by the staff of the lending institution according to expert recommendations.
- In this embodiment, the official resource acquisition qualification of the preset user is queried first. When the official qualification is in a normal state, reference texts corresponding to the preset user's different enterprise information types are obtained from network information sources, yielding reference text sets of different types. These sets are traversed respectively, semantic analysis is performed on each reference text in the currently traversed set, and the public opinion index of the corresponding enterprise information type is obtained from the semantic analysis result. After all sets have been traversed, the current qualification of the preset user is generated according to the public opinion indexes of the different enterprise information types. This improves the efficiency and accuracy with which a resource supply organization reviews the resource acquisition qualifications of resource acquisition users.
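The flow of steps S10 to S40 can be sketched in Python as follows. This is a minimal illustration, not the application's implementation: the function name `generate_current_qualification`, the three injected callables, the status string `"normal"`, and the type labels are all hypothetical, and the source does not say what happens when the official qualification is not normal, so the sketch simply returns `None` in that case.

```python
from typing import Callable, Dict, List, Optional, Sequence


def generate_current_qualification(
    merchant: str,
    query_official_status: Callable[[str], str],
    fetch_reference_texts: Callable[[str, str], List[str]],
    opinion_index: Callable[[List[str]], float],
    preset_index: float,
    info_types: Sequence[str] = ("basic", "judicial", "business"),
) -> Optional[bool]:
    """Return True/False for the current qualification, or None if S20 is never reached."""
    # S10: query the official qualification (e.g. from the target database).
    if query_official_status(merchant) != "normal":
        return None  # assumption: this branch is not described in the source

    # S20 + S30: build one reference text set per enterprise information type
    # and compute one public opinion index per set.
    indexes: Dict[str, float] = {}
    for info_type in info_types:
        texts = fetch_reference_texts(merchant, info_type)
        indexes[info_type] = opinion_index(texts)

    # S40: sum the indexes and compare the sum with the preset index.
    return sum(indexes.values()) > preset_index


if __name__ == "__main__":
    # Stub callables stand in for the database query, the crawler, and the analysis.
    result = generate_current_qualification(
        "example merchant",
        query_official_status=lambda m: "normal",
        fetch_reference_texts=lambda m, t: [f"{t} text about {m}"],
        opinion_index=lambda texts: 1.5,
        preset_index=4.0,
    )
    print(result)
```

Passing the database query, crawler, and analysis in as callables keeps the sketch runnable without any network or database access.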
- FIG. 3 is a schematic flowchart of the second embodiment of the method for generating a resource acquisition qualification of a preset user in this application; the second embodiment is proposed based on the first embodiment of the method.
- Step S30 specifically includes:
- Step S301: traverse the reference text sets of different types respectively, and perform word segmentation on each reference text in the currently traversed reference text set, so that each reference text in the set has multiple feature words with different parts of speech.
- The word segmentation further includes:
- determining each sentence in the reference text, converting the characters in each sentence into a sequence of Chinese character numbers according to character frequency, and converting the characters in each sentence into a corresponding tag sequence according to the position of each character within its word;
- inputting the Chinese character number sequence, sentence by sentence, into the character vector conversion layer of the attention model to output a character vector matrix;
- dividing the character vector matrix into blocks using the mini-batch gradient descent method, and inputting the blocks into the attention model, which includes an encoding layer and a decoding layer, to obtain a predicted tag sequence;
- comparing the predicted tag sequence with the tag sequence of the preset text corpus in the attention model, and synthesizing the final segmented sentence (the target sentence), with words separated by spaces, according to the meaning of each tag;
- taking the words in the final segmented sentence as feature words, and performing part-of-speech tagging on each feature word, so that each reference text in the current-type reference text set has multiple feature words with different parts of speech.
- With this embodiment, word segmentation results can be obtained more quickly and accurately on longer news texts; compared with the prior art, the word segmentation in this embodiment is more efficient.
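The data preparation around the attention model can be illustrated with a short Python sketch. It covers only the parts that do not need a trained model: numbering characters by frequency, deriving tags from character positions, splitting rows into mini-batches, and rebuilding a space-separated sentence from tags. The B/M/E/S tag scheme (begin, middle, end, single) is an assumption, since the source only says tags follow "the position of the character in the word"; all function names are hypothetical, and the attention model itself is not shown.

```python
from collections import Counter
from typing import Dict, Iterator, List, Sequence


def build_char_ids(sentences: Sequence[str]) -> Dict[str, int]:
    """Number characters by descending frequency (1 = most frequent)."""
    counts = Counter(ch for sentence in sentences for ch in sentence)
    # most_common() sorts by count; equal counts keep first-seen order.
    return {ch: rank for rank, (ch, _) in enumerate(counts.most_common(), start=1)}


def to_number_sequence(sentence: str, char_ids: Dict[str, int]) -> List[int]:
    """Map each character to its number; 0 marks an unseen character (assumption)."""
    return [char_ids.get(ch, 0) for ch in sentence]


def to_tag_sequence(words: Sequence[str]) -> List[str]:
    """Tag every character by its position in its word (assumed B/M/E/S scheme)."""
    tags: List[str] = []
    for word in words:
        if len(word) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def mini_batches(rows: Sequence, size: int) -> Iterator[Sequence]:
    """Split rows (e.g. rows of a character vector matrix) into blocks of `size`."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def synthesize_sentence(chars: str, tags: Sequence[str]) -> str:
    """Rebuild a space-separated sentence: a word ends after an E or S tag."""
    pieces: List[str] = []
    for ch, tag in zip(chars, tags):
        pieces.append(ch)
        if tag in ("E", "S"):
            pieces.append(" ")
    return "".join(pieces).strip()


if __name__ == "__main__":
    words = ["企业", "经营", "正常"]
    sentence = "".join(words)
    tags = to_tag_sequence(words)
    print(to_number_sequence(sentence, build_char_ids([sentence])))
    print(tags)
    print(synthesize_sentence(sentence, tags))
```

In the described method the tags passed to the last step would be the predicted tag sequence produced by the model; here the gold tags are reused only to show the round trip.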
- Step S302: analyze the multiple feature words in the current-type reference text set, and determine, from them, the target feature words belonging to a target preset word category.
- The target preset word category is a vocabulary category that reflects positive or negative information about the preset user. The developer of the program in this embodiment classifies, in advance, the nouns, verbs, and adjectives that reflect positive or negative information about an enterprise into different target preset word categories, takes the classified words as target feature words, and saves the mapping between the target feature words and the target preset word categories in a vocabulary.
- When step S302 is executed, the feature words obtained in step S301 are analyzed, their parts of speech are determined, and each feature word is matched against the pre-stored feature words in the vocabulary.
- If a feature word matches a pre-stored word in the vocabulary, the feature word belongs to the vocabulary that reflects positive or negative information about the preset user. The target preset word category to which the matched pre-stored word belongs is then identified, and the feature word is a target feature word of that category.
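The vocabulary lookup in step S302 can be sketched as a dictionary match. The vocabulary entries, the part-of-speech labels, and the category names `"positive"` and `"negative"` below are invented for illustration; the source only states that a mapping from target feature words to target preset word categories is stored in advance.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical vocabulary: (word, part of speech) -> target preset word category.
VOCABULARY: Dict[Tuple[str, str], str] = {
    ("profit", "noun"): "positive",
    ("grow", "verb"): "positive",
    ("default", "verb"): "negative",
    ("lawsuit", "noun"): "negative",
}


def select_target_feature_words(
    feature_words: Sequence[Tuple[str, str]],
    vocabulary: Dict[Tuple[str, str], str] = VOCABULARY,
) -> List[Tuple[str, str]]:
    """Keep the feature words found in the vocabulary, paired with their category."""
    return [
        (word, vocabulary[(word, pos)])
        for word, pos in feature_words
        if (word, pos) in vocabulary
    ]


if __name__ == "__main__":
    tagged = [("profit", "noun"), ("company", "noun"), ("default", "verb")]
    print(select_target_feature_words(tagged))
```

Keying the vocabulary on the word together with its part of speech mirrors the order described in the text, where the part of speech is determined before matching.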
- Step S303: calculate the weight value of the target feature word in the current-type reference text set.
- The weight value in step S303 measures how important the target feature word is for reflecting positive or negative information about the preset user. It differs from a plain proportion: rather than only expressing the percentage that a factor or indicator accounts for, it emphasizes the relative importance of that factor or indicator, that is, its contribution. The higher the weight value of the target feature word in the current-type reference text set, the better the target feature word reflects public opinion about the enterprise.
- The weight value of the target feature word in the current-type reference text set (set a) can be determined by calculating the inverse document frequency of the target feature word in that set.
- Inverse document frequency is a commonly used weighting technique in information retrieval and data mining. If a word or phrase appears frequently in one article but rarely in other articles, it is considered to have good discriminating ability and to be suitable for classification.
- Step S50 is then executed to process set b and set c in the same way.
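The weight calculation can be sketched in Python using the term frequency and inverse document frequency defined in the claims. The natural logarithm in the inverse document frequency is an assumption (the formula itself is not reproduced in this text), as are the function names and the choice to return one weight per reference text that contains the word.

```python
import math
from typing import Dict, List, Sequence


def term_frequency(word: str, text: Sequence[str]) -> float:
    """n_i / sum_k n_k: occurrences of the word over all feature words in the text."""
    return text.count(word) / len(text) if text else 0.0


def inverse_document_frequency(word: str, texts: Sequence[Sequence[str]]) -> float:
    """log(|D| / number of texts containing the word); natural log is an assumption."""
    containing = sum(1 for text in texts if word in text)
    if containing == 0:
        return 0.0  # the word never occurs, so it carries no weight
    return math.log(len(texts) / containing)


def weights(word: str, texts: Sequence[Sequence[str]]) -> Dict[int, float]:
    """tf * idf for every reference text (by position) that contains the word."""
    idf = inverse_document_frequency(word, texts)
    return {
        position: term_frequency(word, text) * idf
        for position, text in enumerate(texts)
        if word in text
    }


if __name__ == "__main__":
    # Each reference text is represented by its list of feature words.
    reference_set: List[List[str]] = [
        ["default", "lawsuit", "default"],
        ["growth"],
        ["growth", "profit"],
    ]
    print(weights("default", reference_set))
    print(weights("growth", reference_set))
```

A word that appears in every reference text gets an inverse document frequency of zero under this form, which matches the stated intuition that only words concentrated in a few texts discriminate well.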
- Step S304: calculate, according to the weight value of the target feature word, the public opinion index of the enterprise information type corresponding to the current-type reference text set.
- A target feature word is necessarily a word that reflects positive or negative information about one of the preset user's enterprise information types. The higher its weight value in the current-type reference text set, the higher the positive or negative public opinion that it indicates for that enterprise information type. The public opinion index of that enterprise information type can therefore be calculated from the weight value of the target feature word.
- FIG. 4 is a schematic flowchart of the third embodiment of the method for generating a resource acquisition qualification of a preset user in this application; the third embodiment is proposed based on the second embodiment of the method.
- The target preset word category includes a first preset word category and a second preset word category; the first preset word category covers words that reflect positive information, and the second preset word category covers words that reflect negative information.
- Step S302 specifically includes:
- Step S032: analyze the multiple feature words in the current-type reference text set, determine from them the target feature words belonging to the target preset word category, and obtain, according to the analysis result, the first target feature words belonging to the first preset word category and the second target feature words belonging to the second preset word category.
- Step S303 includes:
- Step S033: calculate the first weight value of the first target feature word in the current-type reference text set, and calculate the second weight value of the second target feature word in the current-type reference text set.
- Both the first weight value of the first target feature word and the second weight value of the second target feature word in the current-type reference text set can be calculated by formula (1), formula (2), and formula (3) of the third embodiment.
- Step S304 includes:
- Step S034: compare the first weight value with the second weight value, and calculate, according to the comparison result, the public opinion index of the enterprise information type corresponding to the current-type reference text set.
- Multiple risk level intervals of the enterprise information type may be obtained first. The risk level intervals may be preset for the target enterprise by the program developer, and each interval represents a public opinion level. Five public opinion risk levels can be distinguished: major negative, general negative, neutral, general positive, and very positive.
- The current public opinion index of the target enterprise is obtained based on the target risk level interval.
- The higher the weight value of the first target feature word in the current-type reference text set, the better it reflects positive public opinion about the enterprise; the higher the weight value of the second target feature word in the current-type reference text set, the better it reflects negative public opinion about the enterprise.
- For example, the weight value of the first target feature word multiplied by 50%, minus the weight value of the second target feature word multiplied by 30%, can be computed. The risk level interval in which this difference falls then determines the current public opinion index of that enterprise information type of the preset user.
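The comparison in step S034 can be sketched as follows. The 50% and 30% factors and the five level names come from the text; the interval boundaries in `RISK_LEVELS` are invented placeholders, since the source leaves the intervals to the program developer.

```python
from typing import List, Tuple

# Hypothetical upper bounds (exclusive) for the first four levels, in ascending order.
RISK_LEVELS: List[Tuple[float, str]] = [
    (-0.20, "major negative"),
    (-0.05, "general negative"),
    (0.05, "neutral"),
    (0.20, "general positive"),
]
TOP_LEVEL = "very positive"


def weighted_difference(first_weight: float, second_weight: float) -> float:
    """First (positive) weight * 50% minus second (negative) weight * 30%."""
    return first_weight * 0.5 - second_weight * 0.3


def risk_level(difference: float) -> str:
    """Return the level of the first interval whose upper bound exceeds the difference."""
    for upper_bound, name in RISK_LEVELS:
        if difference < upper_bound:
            return name
    return TOP_LEVEL


if __name__ == "__main__":
    diff = weighted_difference(first_weight=0.6, second_weight=0.2)
    print(diff, risk_level(diff))
```

How a level is converted into a numeric public opinion index is not specified in the text, so the sketch stops at the level name.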
- This application also proposes an apparatus for generating a resource acquisition qualification of a preset user, the apparatus including:
- a query module 10, used to query the official resource acquisition qualification of the preset user;
- an obtaining module 20, configured to obtain, when the official resource acquisition qualification is in a normal state, reference texts related to the preset user's different enterprise information types from network information sources, so as to obtain reference text sets of different types;
- a semantic analysis module 30, configured to traverse the reference text sets of different types respectively, perform semantic analysis on each reference text in the currently traversed reference text set, and obtain, according to the semantic analysis result, the public opinion index of the enterprise information type corresponding to that set;
- a generating module 40, configured to generate the current qualification of the preset user according to the public opinion indexes of the different enterprise information types after the reference text sets of different types have been traversed.
- The apparatus for generating a resource acquisition qualification of a preset user in this embodiment may be a computer application program loaded on the device for generating a resource acquisition qualification of a preset user in the above-mentioned embodiment.
- The device for generating the resource acquisition qualification of the preset user may be a computer host or a smart phone used by the staff of the lending institution.
- This application also provides a computer storage medium that stores a program for generating a preset user's resource acquisition qualification; when the program is executed by a processor, the steps of the above method for generating the preset user's resource acquisition qualification are implemented.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Computational Linguistics (AREA)
- Business, Economics & Management (AREA)
- Data Mining & Analysis (AREA)
- Artificial Intelligence (AREA)
- Finance (AREA)
- General Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Accounting & Taxation (AREA)
- Health & Medical Sciences (AREA)
- Economics (AREA)
- General Business, Economics & Management (AREA)
- Technology Law (AREA)
- Strategic Management (AREA)
- Marketing (AREA)
- Development Economics (AREA)
- Probability & Statistics with Applications (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Claims (21)
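- A method for generating a resource acquisition qualification of a preset user, the method comprising: querying the official resource acquisition qualification of the preset user; when the official qualification is in a normal state, obtaining, from network information sources, reference texts of the preset user corresponding to different enterprise information types, to obtain reference text sets of different types; traversing the reference text sets of different types respectively, performing semantic analysis on each reference text in the currently traversed reference text set, and obtaining, according to the semantic analysis result, the public opinion index of the enterprise information type corresponding to the current-type reference text set; and after the reference text sets of different types have been traversed, generating the current qualification of the preset user according to the public opinion indexes of the different enterprise information types.
- The method according to claim 1, wherein the step of traversing the reference text sets of different types respectively, performing semantic analysis on each reference text in the currently traversed reference text set, and obtaining, according to the semantic analysis result, the public opinion index of the enterprise information type corresponding to the current-type reference text set comprises: traversing the reference text sets of different types respectively, and performing word segmentation on each reference text in the currently traversed reference text set, so that each reference text in the current-type reference text set has multiple feature words with different parts of speech; analyzing the multiple feature words in the current-type reference text set, and determining, from the multiple feature words, a target feature word belonging to a target preset word category; calculating the weight value of the target feature word in the current-type reference text set; and calculating, according to the weight value of the target feature word, the public opinion index of the enterprise information type corresponding to the current-type reference text set; wherein performing word segmentation on each reference text in the currently traversed reference text set, so that each reference text in the current-type reference text set has multiple feature words with different parts of speech, comprises: determining each sentence in the reference text, and converting the characters in each sentence into a sequence of Chinese character numbers according to character frequency; converting the characters in each sentence into a corresponding tag sequence according to the position of each character within its word; inputting the Chinese character number sequence, sentence by sentence, into the character vector conversion layer of an attention model to output a character vector matrix; dividing the character vector matrix into blocks using a gradient descent method, and inputting the block results into the attention model to obtain a predicted tag sequence; and comparing the predicted tag sequence with the tag sequence of the preset text corpus in the attention model, and synthesizing a target sentence according to the meaning of each tag, wherein the words in the target sentence are the feature words, and part-of-speech tagging is performed on each feature word, so that each reference text in the current-type reference text set has multiple feature words with different parts of speech.
- The method according to claim 2, wherein the step of calculating the weight value of the target feature word in the current-type reference text set comprises: calculating the term frequency of the target feature word in the corresponding target reference text, the target reference text being a reference text that contains the target feature word; calculating the inverse document frequency of the target feature word in the current-type reference text set; and calculating the weight value of the target feature word in the current-type reference text set according to the term frequency of the target feature word and the inverse document frequency of the target feature word.
- The method according to claim 3, wherein the step of calculating the term frequency of the target feature word in the corresponding target reference text comprises: calculating the term frequency of the target feature word in the corresponding target reference text by formula (1), where tf_i denotes the term frequency of the target feature word T_i in the target reference text, n_i denotes the number of occurrences of the word T_i in the target reference text, and n_k is the number of occurrences of the k-th feature word in the target reference text; the step of calculating the inverse document frequency of the target feature word in the reference text set comprises: calculating the inverse document frequency of the target feature word in the current-type reference text set by formula (2), where |D| denotes the total number of reference texts in the current-type reference text set, |d: t_i ∈ d| denotes the total number of reference texts in the current-type reference text set that include the target feature word T_i, and idf_i denotes the inverse document frequency of the target feature word T_i in the current-type reference text set; and the step of calculating the weight value of the target feature word in the reference text set according to the term frequency of the target feature word and the inverse document frequency of the target feature word comprises: calculating the weight value of the target feature word in the current-type reference text set by formula (3), (tf/idf)_i = tf_i × idf_i, where (tf/idf)_i denotes the weight value of the target feature word T_i in the current-type reference text set.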
- 如权利要求4所述的方法,其中,所述目标预设词类别包括第一预设词类别和第二预设词类别,所述第一预设词类别表征为反映正面信息的词汇,所述第二预设词类别表征为反映负面信息的词汇;所述对所述当前类型参考文本集合中的多个所述特征词进行分析,从多个所述特征词中确定属于目标预设词类别的目标特征词的步骤,包括:对所述当前类型参考文本集合中的多个所述特征词进行分析,从多个所 述特征词中确定属于目标预设词类别的目标特征词,根据分析结果获取属于第一预设词类别的第一目标特征词、以及属于第二预设词类别的第二目标特征词;所述计算所述目标特征词在所述当前类型参考文本集合中的权重值的步骤,包括:计算所述第一目标特征词在所述当前类型参考文本集合中的第一权重值;计算所述第二目标特征词在所述当前类型参考文本集合中的第二权重值;所述根据所述目标特征词的权重值测算所述当前类型参考文本集合对应的企业信息类型的舆情指数的步骤,包括:将所述第一权重值与所述第二权重值进行比较,根据比较结果测算所述当前类型参考文本集合对应的企业信息类型的舆情指数。
- 如权利要求5所述的方法,其中,所述将所述第一权重值与所述第二权重值进行比较,根据比较结果测算所述当前类型参考文本集合对应的企业信息类型的舆情指数的步骤,包括:获取所述企业信息类型的多个风险级别区间;根据所述第一权重值与所述第二权重值之间的差值,从所述多个风险级别区间中获取对应的目标风险级别区间;基于目标风险级别区间获取所述目标企业的当前舆情指数。
- 如权利要求1所述的方法,其中,所述对商家的官方资源获取资质进行查询的步骤,具体包括:在检测到与目标数据库处于通讯状态时,从所述目标数据库中查询商家的官方资源获取资质。
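- A device for generating a resource acquisition qualification of a preset user, the device comprising: a memory, a processor, and a program for generating a resource acquisition qualification of a preset user that is stored in the memory and executable on the processor, the program being configured to implement the following steps: querying an official resource acquisition qualification of the preset user; when the official qualification is in a normal state, obtaining, from network information sources, reference texts of the preset user corresponding to different enterprise information types, to obtain reference text sets of different types; traversing the reference text sets of different types, performing semantic analysis on each reference text in the currently traversed reference text set (the current type reference text set), and obtaining, according to the semantic analysis result, a public opinion index of the enterprise information type corresponding to the current type reference text set; and after the reference text sets of different types have all been traversed, generating a current qualification of the preset user according to the public opinion indices of the different enterprise information types.
- The device according to claim 8, wherein the step of traversing the reference text sets of different types, performing semantic analysis on each reference text in the current type reference text set, and obtaining, according to the semantic analysis result, the public opinion index of the enterprise information type corresponding to the current type reference text set comprises: traversing the reference text sets of different types, and performing word segmentation on each reference text in the current type reference text set, so that each reference text in the current type reference text set has a plurality of feature words of different parts of speech; analyzing the plurality of feature words in the current type reference text set, and determining, from the plurality of feature words, a target feature word belonging to a target preset word category; calculating a weight value of the target feature word in the current type reference text set; and estimating, according to the weight value of the target feature word, the public opinion index of the enterprise information type corresponding to the current type reference text set; wherein performing word segmentation on each reference text in the current type reference text set, so that each reference text in the current type reference text set has a plurality of feature words of different parts of speech, comprises: determining the sentences in the reference text, and converting the characters in each sentence into a Chinese character numeric sequence according to character frequency; converting the characters in each sentence into a corresponding label sequence according to the position of each character within its word; inputting the Chinese character numeric sequence, sentence by sentence, into a character vector conversion layer of an attention model to output a character vector matrix; performing block processing on the character vector matrix using a gradient descent method, and inputting the block processing result into the attention model to obtain a predicted label sequence; and comparing the predicted label sequence with the label sequence of a preset text corpus in the attention model, and synthesizing a target sentence according to the meaning of each label, wherein the words in the target sentence are the feature words, and performing part-of-speech tagging on each feature word, so that each reference text in the current type reference text set has a plurality of feature words of different parts of speech.
- The device according to claim 9, wherein the step of calculating the weight value of the target feature word in the current type reference text set comprises: calculating a term frequency of the target feature word in a corresponding target reference text, the target reference text being a reference text that contains the target feature word; calculating an inverse document frequency of the target feature word in the current type reference text set; and calculating the weight value of the target feature word in the current type reference text set according to the term frequency of the target feature word and the inverse document frequency of the target feature word.
- The device according to claim 10, wherein the step of calculating the term frequency of the target feature word in the corresponding target reference text comprises: calculating the term frequency of the target feature word in the corresponding target reference text by the following formula (1): $tf_i = \frac{n_i}{\sum_k n_k}$, where $tf_i$ denotes the term frequency of the target feature word $T_i$ in the target reference text, $n_i$ denotes the number of occurrences of the word $T_i$ in the target reference text, and $n_k$ is the number of occurrences of the k-th feature word in the target reference text; the step of calculating the inverse document frequency of the target feature word in the reference text set comprises: calculating the inverse document frequency of the target feature word in the current type reference text set by the following formula (2): $idf_i = \log\frac{|D|}{|\{d : t_i \in d\}|}$, where $|D|$ denotes the total number of reference texts in the current type reference text set, $|\{d : t_i \in d\}|$ denotes the total number of reference texts in the current type reference text set that include the target feature word $T_i$, and $idf_i$ denotes the inverse document frequency of the target feature word $T_i$ in the current type reference text set; and the step of calculating the weight value of the target feature word in the reference text set according to the term frequency of the target feature word and the inverse document frequency of the target feature word comprises: calculating the weight value of the target feature word in the current type reference text set by the following formula (3): $(tf/idf)_i = tf_i \times idf_i$, where $(tf/idf)_i$ denotes the weight value of the target feature word $T_i$ in the current type reference text set.
- The device according to claim 11, wherein the target preset word category comprises a first preset word category and a second preset word category, the first preset word category representing words that reflect positive information, and the second preset word category representing words that reflect negative information; the step of analyzing the plurality of feature words in the current type reference text set, and determining, from the plurality of feature words, the target feature word belonging to the target preset word category comprises: analyzing the plurality of feature words in the current type reference text set, determining, from the plurality of feature words, the target feature words belonging to the target preset word category, and obtaining, according to the analysis result, a first target feature word belonging to the first preset word category and a second target feature word belonging to the second preset word category; the step of calculating the weight value of the target feature word in the current type reference text set comprises: calculating a first weight value of the first target feature word in the current type reference text set; and calculating a second weight value of the second target feature word in the current type reference text set; and the step of estimating, according to the weight value of the target feature word, the public opinion index of the enterprise information type corresponding to the current type reference text set comprises: comparing the first weight value with the second weight value, and estimating, according to the comparison result, the public opinion index of the enterprise information type corresponding to the current type reference text set.
- The device according to claim 12, wherein the step of comparing the first weight value with the second weight value, and estimating, according to the comparison result, the public opinion index of the enterprise information type corresponding to the current type reference text set comprises: obtaining a plurality of risk level intervals of the enterprise information type; obtaining, according to the difference between the first weight value and the second weight value, a corresponding target risk level interval from the plurality of risk level intervals; and obtaining the current public opinion index of the target enterprise based on the target risk level interval.
- The device according to claim 8, wherein the step of querying the official resource acquisition qualification of the merchant specifically comprises: upon detecting that communication with a target database is established, querying the official resource acquisition qualification of the merchant from the target database.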
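- A computer storage medium storing a program for generating a resource acquisition qualification of a preset user, the program being configured to implement the following steps: querying an official resource acquisition qualification of the preset user; when the official qualification is in a normal state, obtaining, from network information sources, reference texts of the preset user corresponding to different enterprise information types, to obtain reference text sets of different types; traversing the reference text sets of different types, performing semantic analysis on each reference text in the currently traversed reference text set (the current type reference text set), and obtaining, according to the semantic analysis result, a public opinion index of the enterprise information type corresponding to the current type reference text set; and after the reference text sets of different types have all been traversed, generating a current qualification of the preset user according to the public opinion indices of the different enterprise information types.
- The computer storage medium according to claim 15, wherein the step of traversing the reference text sets of different types, performing semantic analysis on each reference text in the current type reference text set, and obtaining, according to the semantic analysis result, the public opinion index of the enterprise information type corresponding to the current type reference text set comprises: traversing the reference text sets of different types, and performing word segmentation on each reference text in the current type reference text set, so that each reference text in the current type reference text set has a plurality of feature words of different parts of speech; analyzing the plurality of feature words in the current type reference text set, and determining, from the plurality of feature words, a target feature word belonging to a target preset word category; calculating a weight value of the target feature word in the current type reference text set; and estimating, according to the weight value of the target feature word, the public opinion index of the enterprise information type corresponding to the current type reference text set; wherein performing word segmentation on each reference text in the current type reference text set, so that each reference text in the current type reference text set has a plurality of feature words of different parts of speech, comprises: determining the sentences in the reference text, and converting the characters in each sentence into a Chinese character numeric sequence according to character frequency; converting the characters in each sentence into a corresponding label sequence according to the position of each character within its word; inputting the Chinese character numeric sequence, sentence by sentence, into a character vector conversion layer of an attention model to output a character vector matrix; performing block processing on the character vector matrix using a gradient descent method, and inputting the block processing result into the attention model to obtain a predicted label sequence; and comparing the predicted label sequence with the label sequence of a preset text corpus in the attention model, and synthesizing a target sentence according to the meaning of each label, wherein the words in the target sentence are the feature words, and performing part-of-speech tagging on each feature word, so that each reference text in the current type reference text set has a plurality of feature words of different parts of speech.
- The computer storage medium according to claim 16, wherein the step of calculating the weight value of the target feature word in the current type reference text set comprises: calculating a term frequency of the target feature word in a corresponding target reference text, the target reference text being a reference text that contains the target feature word; calculating an inverse document frequency of the target feature word in the current type reference text set; and calculating the weight value of the target feature word in the current type reference text set according to the term frequency of the target feature word and the inverse document frequency of the target feature word.
- The computer storage medium according to claim 17, wherein the step of calculating the term frequency of the target feature word in the corresponding target reference text comprises: calculating the term frequency of the target feature word in the corresponding target reference text by the following formula (1): $tf_i = \frac{n_i}{\sum_k n_k}$, where $tf_i$ denotes the term frequency of the target feature word $T_i$ in the target reference text, $n_i$ denotes the number of occurrences of the word $T_i$ in the target reference text, and $n_k$ is the number of occurrences of the k-th feature word in the target reference text; the step of calculating the inverse document frequency of the target feature word in the reference text set comprises: calculating the inverse document frequency of the target feature word in the current type reference text set by the following formula (2): $idf_i = \log\frac{|D|}{|\{d : t_i \in d\}|}$, where $|D|$ denotes the total number of reference texts in the current type reference text set, $|\{d : t_i \in d\}|$ denotes the total number of reference texts in the current type reference text set that include the target feature word $T_i$, and $idf_i$ denotes the inverse document frequency of the target feature word $T_i$ in the current type reference text set; and the step of calculating the weight value of the target feature word in the reference text set according to the term frequency of the target feature word and the inverse document frequency of the target feature word comprises: calculating the weight value of the target feature word in the current type reference text set by the following formula (3): $(tf/idf)_i = tf_i \times idf_i$, where $(tf/idf)_i$ denotes the weight value of the target feature word $T_i$ in the current type reference text set.
- The computer storage medium according to claim 18, wherein the target preset word category comprises a first preset word category and a second preset word category, the first preset word category representing words that reflect positive information, and the second preset word category representing words that reflect negative information; the step of analyzing the plurality of feature words in the current type reference text set, and determining, from the plurality of feature words, the target feature word belonging to the target preset word category comprises: analyzing the plurality of feature words in the current type reference text set, determining, from the plurality of feature words, the target feature words belonging to the target preset word category, and obtaining, according to the analysis result, a first target feature word belonging to the first preset word category and a second target feature word belonging to the second preset word category; the step of calculating the weight value of the target feature word in the current type reference text set comprises: calculating a first weight value of the first target feature word in the current type reference text set; and calculating a second weight value of the second target feature word in the current type reference text set; and the step of estimating, according to the weight value of the target feature word, the public opinion index of the enterprise information type corresponding to the current type reference text set comprises: comparing the first weight value with the second weight value, and estimating, according to the comparison result, the public opinion index of the enterprise information type corresponding to the current type reference text set.
- The computer storage medium according to claim 19, wherein the step of comparing the first weight value with the second weight value, and estimating, according to the comparison result, the public opinion index of the enterprise information type corresponding to the current type reference text set comprises: obtaining a plurality of risk level intervals of the enterprise information type; obtaining, according to the difference between the first weight value and the second weight value, a corresponding target risk level interval from the plurality of risk level intervals; and obtaining the current public opinion index of the target enterprise based on the target risk level interval.
- The computer storage medium according to claim 15, wherein the step of querying the official resource acquisition qualification of the merchant specifically comprises: upon detecting that communication with a target database is established, querying the official resource acquisition qualification of the merchant from the target database.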
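
The weighting and comparison steps recited in claims 3 to 6 can be illustrated with a minimal Python sketch. It is not the applicant's implementation, and several things in it are assumptions made only for illustration: reference texts are taken to be already segmented into lists of feature words, the first and second weight values are formed by summing the per-word tf × idf weights over every reference text that contains the word (the claims do not say how per-word weights are aggregated), the logarithm is the natural logarithm, and the interval boundaries `RISK_BOUNDS`, the labels in `RISK_LEVELS`, and the sample corpus are hypothetical.

```python
import math
from bisect import bisect_right
from collections import Counter


def term_frequency(term, doc):
    """Formula (1): occurrences of `term` divided by all feature-word occurrences in `doc`."""
    counts = Counter(doc)
    total = sum(counts.values())
    return counts[term] / total if total else 0.0


def inverse_document_frequency(term, corpus):
    """Formula (2): log(|D| / number of reference texts containing `term`)."""
    containing = sum(1 for doc in corpus if term in doc)
    # Assumption: a word that appears in no reference text gets weight 0.
    return math.log(len(corpus) / containing) if containing else 0.0


def category_weight(terms, corpus):
    """Assumed aggregation: sum tf * idf (formula (3)) over each text containing the word."""
    weight = 0.0
    for term in terms:
        idf = inverse_document_frequency(term, corpus)
        for doc in corpus:
            if term in doc:
                weight += term_frequency(term, doc) * idf
    return weight


# Hypothetical interval boundaries on (first weight - second weight).
RISK_BOUNDS = [-0.5, 0.5]
RISK_LEVELS = ["high risk", "medium risk", "low risk"]


def risk_level(first_weight, second_weight):
    """Map the weight difference onto one of the risk level intervals."""
    diff = first_weight - second_weight
    return RISK_LEVELS[bisect_right(RISK_BOUNDS, diff)]


# Hypothetical, already-segmented reference texts for one enterprise information type.
corpus = [
    ["profit", "growth", "award"],
    ["lawsuit", "default", "profit"],
    ["lawsuit", "penalty"],
    ["product", "launch"],
]
positive_words = ["growth", "award"]              # first preset word category
negative_words = ["lawsuit", "default", "penalty"]  # second preset word category

first_weight = category_weight(positive_words, corpus)
second_weight = category_weight(negative_words, corpus)
print(risk_level(first_weight, second_weight))
```

In this sample the negative words outweigh the positive ones by more than the lower boundary, so the difference falls into the first interval. A real system would need intervals calibrated per enterprise information type, as claim 6 suggests by obtaining the intervals for each type.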
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910540026.8A CN110377809A (zh) | 2019-06-19 | 2019-06-19 | Method for generating a resource acquisition qualification of a preset user, and related device |
CN201910540026.8 | 2019-06-19 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020253353A1 (zh) | 2020-12-24 |
Family
ID=68250598
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2020/085847 WO2020253353A1 (zh) | 2019-06-19 | 2020-04-21 | Method for generating a resource acquisition qualification of a preset user, and related device |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110377809A (zh) |
WO (1) | WO2020253353A1 (zh) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
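CN113554411A (zh) * | 2021-06-28 | 2021-10-26 | 北京来也网络科技有限公司 | Method and apparatus for processing enterprise qualification declarations combining RPA and AI |
CN114299520A (zh) * | 2021-12-29 | 2022-04-08 | 福建亿力电力科技有限责任公司 | Supplier qualification review method and review device based on dual-model fusion |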
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110377809A (zh) * | 2019-06-19 | 2019-10-25 | 深圳壹账通智能科技有限公司 | Method for generating a resource acquisition qualification of a preset user, and related device |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
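CN107229612A (zh) * | 2017-05-24 | 2017-10-03 | 重庆誉存大数据科技有限公司 | Method and system for analyzing the semantic orientation of network information |
CN107463616A (zh) * | 2017-07-03 | 2017-12-12 | 上海凡响网络科技有限公司 | Enterprise information analysis method and system |
CN107688594A (zh) * | 2017-05-05 | 2018-02-13 | 平安科技(深圳)有限公司 | System and method for identifying risk events based on social information |
WO2019024496A1 (zh) * | 2017-08-04 | 2019-02-07 | 平安科技(深圳)有限公司 | Enterprise recommendation method and application server |
CN110377809A (zh) * | 2019-06-19 | 2019-10-25 | 深圳壹账通智能科技有限公司 | Method for generating a resource acquisition qualification of a preset user, and related device |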
- 2019-06-19: CN application CN201910540026.8A (patent/CN110377809A/zh), active, Pending
- 2020-04-21: WO application PCT/CN2020/085847 (patent/WO2020253353A1/zh), active, Application Filing
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
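CN107688594A (zh) * | 2017-05-05 | 2018-02-13 | 平安科技(深圳)有限公司 | System and method for identifying risk events based on social information |
CN107229612A (zh) * | 2017-05-24 | 2017-10-03 | 重庆誉存大数据科技有限公司 | Method and system for analyzing the semantic orientation of network information |
CN107463616A (zh) * | 2017-07-03 | 2017-12-12 | 上海凡响网络科技有限公司 | Enterprise information analysis method and system |
WO2019024496A1 (zh) * | 2017-08-04 | 2019-02-07 | 平安科技(深圳)有限公司 | Enterprise recommendation method and application server |
CN110377809A (zh) * | 2019-06-19 | 2019-10-25 | 深圳壹账通智能科技有限公司 | Method for generating a resource acquisition qualification of a preset user, and related device |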
Also Published As
Publication number | Publication date |
---|---|
CN110377809A (zh) | 2019-10-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
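US20210224693A1 (en) | | Systems and Methods for Predictive Coding |
US20170235820A1 (en) | | System and engine for seeded clustering of news events |
US8103678B1 (en) | | System and method for establishing relevance of objects in an enterprise system |
US20130218644A1 (en) | | Determination of expertise authority |
US20200250212A1 (en) | | Methods and Systems for Searching, Reviewing and Organizing Data Using Hierarchical Agglomerative Clustering |
CN111651552B (zh) | | Structured information determination method, apparatus, and electronic device |
US11775504B2 (en) | | Computer estimations based on statistical tree structures |
WO2020253353A1 (zh) | | Method for generating a resource acquisition qualification of a preset user, and related device |
CN113590945B (zh) | | Book recommendation method and apparatus based on user borrowing behavior-interest prediction |
CA2956627A1 (en) | | System and engine for seeded clustering of news events |
CN115630144B (zh) | | Document search method, apparatus, and related device |
Zheng et al. | | Algorithm for recommending answer providers in community-based question answering |
CN115374354A (zh) | | Machine-learning-based science and technology service recommendation method, apparatus, device, and medium |
CN116109373A (zh) | | Financial product recommendation method, apparatus, electronic device, and medium |
CN110532229B (zh) | | Evidence file retrieval method, apparatus, computer device, and storage medium |
CN118396786A (zh) | | Contract document review method and apparatus, electronic device, and computer-readable storage medium |
CN111126073B (zh) | | Semantic retrieval method and apparatus |
Liang et al. | | Detecting novel business blogs |
US9069858B1 (en) | | Systems and methods for identifying entity mentions referencing a same real-world entity |
CN115062110A (zh) | | Text processing method, apparatus, electronic device, and medium |
CN113672703A (zh) | | User information update method, apparatus, device, and storage medium |
CN113095078A (zh) | | Associated asset determination method, apparatus, and electronic device |
CN113849618A (zh) | | Knowledge-graph-based policy determination method, apparatus, electronic device, and medium |
US10824681B2 (en) | | Enterprise resource textual analysis |
CN113177116B (zh) | | Information display method and apparatus, electronic device, storage medium, and program product |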
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 20826085; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 20826085; Country of ref document: EP; Kind code of ref document: A1 |
 | 32PN | Ep: public notification in the ep bulletin as address of the addressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 29/03/2022) |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 20826085; Country of ref document: EP; Kind code of ref document: A1 |