CN116681056B - Text value calculation method and device based on value scale - Google Patents


Info

Publication number
CN116681056B
CN116681056B (application number CN202310596067.5A)
Authority
CN
China
Prior art keywords
node
core
preset
word
keyword
Prior art date
Legal status
Active
Application number
CN202310596067.5A
Other languages
Chinese (zh)
Other versions
CN116681056A (en)
Inventor
张勇东
毛震东
刘毅
郭俊波
陈伟东
Current Assignee
University of Science and Technology of China USTC
Konami Sports Club Co Ltd
Original Assignee
University of Science and Technology of China USTC
People Co Ltd
Priority date
Filing date
Publication date
Application filed by University of Science and Technology of China USTC, People Co Ltd filed Critical University of Science and Technology of China USTC
Priority to CN202310596067.5A priority Critical patent/CN116681056B/en
Publication of CN116681056A publication Critical patent/CN116681056A/en
Application granted granted Critical
Publication of CN116681056B publication Critical patent/CN116681056B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/205: Parsing
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31: Indexing; Data structures therefor; Storage structures
    • G06F16/313: Selection or weighting of terms for indexing
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33: Querying
    • G06F16/335: Filtering based on additional data, e.g. user or group profiles
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities
    • G06F40/284: Lexical analysis, e.g. tokenisation or collocates
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the invention discloses a text value calculation method and device based on a value table. The method comprises the following steps: performing word segmentation on a text to obtain a keyword set containing a plurality of keywords; traversing the keyword set based on a preset value table and querying the node keywords that match the keywords, so as to obtain matched node sets of different levels, wherein the preset value table comprises nodes of a plurality of preset levels and each node includes a node keyword; and calculating the value data of the text according to the number and weight of the matched node sets of each level. By segmenting the text, matching the keywords in the text against the node keywords of the preset value table to determine the matched node sets of different levels contained in the text, and then computing the value data of the text from the number and weight of those sets, the value of the text is determined on the basis of the preset value table.

Description

Text value calculation method and device based on value scale
Technical Field
The embodiment of the invention relates to the technical field of artificial intelligence, in particular to a text value calculation method and device based on a value table.
Background
With the development of technology, self-media differ from the traditional media ecology. In traditional media, content is mainly produced and released by professional entities, and the information is characterized by high public credibility and strict content management. In the self-media age, anyone can create and publish content through the Internet, so the quality of information spread on the network is poorly guaranteed. Content on the various media platforms is of mixed quality, and a large amount of content with low value orientation exists. Because production costs and publication thresholds are low, large quantities of low-value content exist on the network and spread easily, challenging the propagation of mainstream-value content. If low-value content is allowed to grow freely without guidance, useless and harmful information will flood the network, pollute cyberspace, negatively affect social mores, and subtly erode the public's values.
Existing methods for guiding network information mainly include rumor detection, public-opinion monitoring, standard formulation and popularity prediction. Their main purpose is to identify fabricated information, monitor the development of hot events, and so on. Standard formulation manages and guides information release by explicitly specifying the permitted content and form of network information through relevant standards and specifications, but this approach is rigid and lacks flexibility. Information popularity prediction generally assumes that more popular information tends to be more valuable, but this deviates from reality: vulgar, cheap, low-value information that panders to the crowd is sometimes easier to spread. It is therefore necessary to calculate the value of web-content text from the perspective of values, rather than focusing only on one-sided aspects such as forgery or popularity.
Disclosure of Invention
In view of the foregoing, embodiments of the present invention provide a value table-based text value calculation method and apparatus that overcome, or at least partially solve, the foregoing problems.
According to an aspect of the embodiment of the present invention, there is provided a text value calculation method based on a value table, including:
word segmentation is carried out on the text to obtain a keyword set containing a plurality of keywords;
traversing the keyword set based on a preset value table, and inquiring node keywords matched with the keywords to obtain matched node sets with different levels; the preset value table comprises a plurality of preset level nodes; each node includes a node key;
and calculating the value data of the text according to the number and the weight of the matched node sets of different levels.
According to another aspect of the embodiment of the present invention, there is provided a text value calculating apparatus based on a value table, the apparatus including:
the word segmentation module is suitable for carrying out word segmentation processing on the text to obtain a keyword set containing a plurality of keywords;
the matching module is suitable for traversing the keyword set based on a preset value table, inquiring node keywords matched with the keywords, and obtaining matching node sets with different levels; the preset value table comprises a plurality of preset level nodes; each node includes a node key;
And the value calculation module is suitable for calculating and obtaining the value data of the text according to the number and the weight of the matched node sets of different levels.
According to yet another aspect of an embodiment of the present invention, there is provided a computing device including: a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with one another through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction enables the processor to execute the operation corresponding to the text value calculation method based on the value table.
According to still another aspect of the embodiments of the present invention, there is provided a computer storage medium having stored therein at least one executable instruction for causing a processor to perform operations corresponding to the above-described value table-based text value calculation method.
According to the text value calculation method and device based on the value table, the text is segmented; the matched node sets of different levels contained in the text are determined by matching the keywords in the text with the node keywords in the preset value table; and the value data of the text is then calculated according to the number and weight of the matched node sets of each level, so that the value of the text is determined on the basis of the preset value table.
The foregoing description is only an overview of the technical solutions of the embodiments of the present invention. So that the technical means of the embodiments can be understood more clearly and implemented according to the content of the specification, specific implementations of the embodiments of the present invention are set forth below.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
FIG. 1 illustrates a flow chart of a value scale based text value calculation method according to one embodiment of the invention;
FIG. 2 shows a flow chart for updating a preset value table;
FIG. 3 illustrates a schematic diagram of a value table-based text value computing device, according to one embodiment of the invention;
FIG. 4 illustrates a schematic diagram of a computing device, according to one embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present invention are shown in the drawings, it should be understood that the present invention may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
FIG. 1 shows a flow chart of a value scale based text value calculation method according to one embodiment of the invention, as shown in FIG. 1, comprising the steps of:
step S101, word segmentation processing is carried out on the text, and a keyword set containing a plurality of keywords is obtained.
According to the method, the value of the various texts published by users on the network is calculated by measuring how well a text matches mainstream values, taking the maintenance of correct public-opinion guidance as the fundamental basis, so as to ensure the correct understanding and accurate propagation of mainstream-value content.
Specifically, after the text is obtained, it is preprocessed; the preprocessing includes, for example, format filtering and stop-word filtering. Preprocessing removes formatted information and meaningless words that are irrelevant to text value calculation, reducing noise and ensuring the accuracy of subsequent word segmentation. Examples include dates in the text, datelines in news reports, URLs, and the like. For stop words, a stop-word list can be preset and filtering performed against it; the preset list contains words or symbols with no value-relevant meaning, such as "@" or "emmmm". The formatted information and the preset stop-word list above are examples and may be set according to the implementation, which is not limited herein.
The preprocessed text is then processed according to punctuation: the text is first split into sentences at punctuation marks, and each sentence undergoes word segmentation to obtain the phrases it contains; the segmentation can be performed with natural-language-processing tools such as NER (Named Entity Recognition), yielding phrases such as "human", "fortune" and "community". Furthermore, because the phrases obtained by sentence-level segmentation do not capture the associations between phrases, this embodiment also combines phrases based on a preset expansion word list to obtain the corresponding keywords, which form the keyword set. The preset expansion word list is set according to the implementation.
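As a minimal sketch of the preprocessing and sentence-splitting described above (the stop words, format patterns and punctuation set here are illustrative assumptions, not the patent's actual lists):

```python
import re

# Illustrative stop words and format patterns; the patent leaves both
# implementation-defined.
STOP_WORDS = {"@", "emmmm"}
FORMAT_PATTERNS = [r"https?://\S+", r"\d{4}-\d{2}-\d{2}"]  # URLs, dates

def preprocess(text: str) -> str:
    """Format filtering then stop-word filtering, per the preprocessing step."""
    for pat in FORMAT_PATTERNS:
        text = re.sub(pat, "", text)
    for sw in STOP_WORDS:
        text = text.replace(sw, "")
    return text

def split_sentences(text: str) -> list:
    """Split on sentence-final punctuation before per-sentence segmentation."""
    return [s for s in re.split(r"[。！？.!?]", text) if s.strip()]

cleaned = preprocess("emmmm see https://example.com on 2023-05-23. fine!")
assert "https" not in cleaned and "emmmm" not in cleaned
```

In practice a real segmenter (e.g. an NER-capable NLP toolkit) would replace the whitespace-free splitting here; this sketch only shows where format and stop-word filtering sit in the pipeline.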
The keyword set contains a plurality of keywords obtained from the text, and subsequent text value calculation is performed based on the keywords.
Step S102, traversing the keyword set based on a preset value table, and inquiring node keywords matched with the keywords to obtain matched node sets with different levels.
The preset value table can be set in advance in the form of a hierarchical-label semantic knowledge graph comprising nodes of several preset levels: core nodes, secondary-core nodes and peripheral nodes, in decreasing order of value (a core node is worth more than a secondary-core node, which in turn is worth more than a peripheral node). The division into core, secondary-core and peripheral nodes is set according to the implementation in combination with current mainstream values and is not limited herein. Each node includes a node keyword and may also include, for example, node frequency, related and similar nodes, a node number, an entity type, and so on. The node number allows the node to which a node keyword belongs to be located quickly: a number starting with A denotes a core node, B a secondary-core node, and C a peripheral node. These are examples; the node information returned for a query (node frequency, related nodes, similar nodes, etc.) can be set according to the implementation. The degree of a node (the total number of its related and similar nodes) can be accumulated from the numbers of related and similar nodes returned. The related and similar nodes returned for a node keyword can in turn be used as query words against the preset value table to find the original node's second-order related and similar nodes; whether to query once or repeatedly on the query results can be chosen according to the implementation and is not limited herein.
After the keyword set is obtained, it can be traversed. For each keyword, the preset value table is queried for a matching node keyword, i.e. whether a corresponding node keyword exists in the table. If so, the keyword is classified according to the level of the node it belongs to, producing matched node sets of different levels: a core node set, a secondary-core node set and a peripheral node set. For example, if the matched node keyword's number is AXXXX, its level can be determined from the number and the keyword is placed in the core node set. If querying the preset value table yields no matching node keyword, the keyword is placed in a non-value matched node set, which is not used in the text value calculation. The keywords contained in the core, secondary-core, peripheral and non-value matched node sets do not overlap.
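The level lookup via node-number prefixes (A for core, B for secondary-core, C for peripheral) can be sketched as follows; the table entries are illustrative placeholders, not the patent's actual value table:

```python
# Hypothetical value table: node keyword -> node number.
VALUE_TABLE = {
    "community": "A0001",   # hypothetical core node
    "fortune":   "B0012",   # hypothetical secondary-core node
    "human":     "C0103",   # hypothetical peripheral node
}

def match_keywords(keywords):
    """Partition keywords into core/secondary/peripheral/non-value sets
    by the first character of the matched node number."""
    sets = {"A": set(), "B": set(), "C": set(), "none": set()}
    for kw in keywords:
        number = VALUE_TABLE.get(kw)
        sets[number[0] if number else "none"].add(kw)
    return sets

matched = match_keywords(["community", "human", "unmatched-word"])
assert matched["A"] == {"community"} and matched["C"] == {"human"}
```

Keeping the sets as Python sets also gives the non-duplication of keywords within each matched node set for free.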
Further, the preset value table may be preset, and may also be updated according to a new text, as shown in fig. 2:
Step S201, splitting the first text into a plurality of sentences, performing first word segmentation on the sentences, and acquiring part-of-speech information, grammar dependency relationship and semantic dependency relationship information of each first word segmentation.
For any new text (hereinafter referred to as the first text), the text is split into a plurality of sentences and each sentence undergoes first word-segmentation processing, for example with HanLP (Han Language Processing, a Chinese NLP package), which performs word segmentation, part-of-speech tagging, entity recognition and the like, yielding each first word of the sentence together with its part-of-speech, grammatical-dependency and semantic-dependency information. The split first text is denoted D = {s_i, i = 1, 2, ..., N}, where s_i is the i-th sentence of the first text and N is the total number of sentences; s_i = {w_j, j = 1, 2, ..., V}, where w_j is the j-th first word of sentence s_i and V is the total number of first words.
Step S202, extracting to-be-processed segmented words according to part-of-speech information, grammar dependency relation and semantic dependency relation information of each first segmented word, and filtering to-be-processed segmented words to obtain to-be-processed segmented word sets.
Based on the part-of-speech, grammatical-dependency and semantic-dependency information of each first word w_j, a sliding window of size n is run over the characters of each first word to extract candidate words to be processed; that is, the candidates are generated as n-grams. After the candidate words are obtained, they are filtered; the filtering includes stop-word filtering, number filtering, low-frequency person-name filtering, numeral-word filtering, part-of-speech filtering and keyword filtering. A filter list can be configured, and everyday common words are removed according to it, so that new words can be discovered more quickly for updating the preset value table. The filtered result is the word set to be processed.
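The n-gram sliding window and the filter-list step can be sketched as follows (the window size and filter list are assumptions; the patent leaves both implementation-defined):

```python
def char_ngrams(word, n=2):
    """Slide a window of size n over a word's characters (the n-gram step)."""
    if len(word) < n:
        return [word]
    return [word[i:i + n] for i in range(len(word) - n + 1)]

def filter_candidates(cands, filter_list):
    """Drop everyday common words per the filter list so that genuinely
    new words surface faster for updating the preset value table."""
    return [c for c in cands if c not in filter_list]

assert char_ngrams("abcd") == ["ab", "bc", "cd"]
assert filter_candidates(["ab", "the"], {"the"}) == ["ab"]
```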
Step S203, extracting, based on a preset model, the word-segmentation feature set of the word set to be processed, the core feature set of the core-node keywords of the preset value table, the secondary-core feature set of the secondary-core-node keywords, and the peripheral feature set of the peripheral-node keywords; calculating the core similarity of each word in the set from the word-segmentation feature set, the core feature set and the number of core-node keywords; calculating the secondary-core similarity from the word-segmentation feature set, the secondary-core feature set and the number of secondary-core-node keywords; and calculating the peripheral similarity from the word-segmentation feature set, the peripheral feature set and the number of peripheral-node keywords.
For the word set to be processed O = {n_1, n_2, ..., n_i, ..., n_m}, where m is the total number of n-gram candidates obtained, a preset model such as a pre-trained self-encoding language model (e.g. BERT) can be used to extract the word-segmentation feature set f_O = {f_Oi, i = 1, 2, ..., m}, with f_O ∈ R^(m×d), where R^(m×d) is the real space of dimension m×d and d is the feature dimension. Each f_Oi is obtained by the following formula:

f_Oi = LM(n_i)

where LM denotes the preset model, n_i is the i-th word in the word set to be processed, and f_Oi is the word-segmentation feature of n_i. Correspondingly, by the same formula, the preset model yields the core feature set f_A of the core-node keywords of the preset value table, the secondary-core feature set f_B of the secondary-core-node keywords, and the peripheral feature set f_C of the peripheral-node keywords.
After the word-segmentation feature set f_O, the core feature set f_A, the secondary-core feature set f_B and the peripheral feature set f_C are obtained, each similarity can be computed from the corresponding feature sets and keyword counts. Taking the core similarity as an example:

sim_A = (1/A) Σ_{j=1..A} f_Oi · f_Aj^T

where A is the number of core-node keywords, f_Aj is the word-segmentation feature of the j-th core-node keyword, T denotes transposition, and sim_A is the core similarity. Correspondingly, the secondary-core similarity sim_B is computed from f_O, f_B and the number of secondary-core-node keywords, and the peripheral similarity sim_C from f_O, f_C and the number of peripheral-node keywords. sim_A, sim_B and sim_C take values in the range 0 to 1.
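The similarity can be sketched as follows, assuming it averages dot products of unit-normalized features (one plausible reading of the definitions above, consistent with a bounded range); real model features of dimension d would replace the toy 2-d vectors:

```python
import numpy as np

def similarity(f_word, f_level):
    """Average dot product between one candidate's feature vector and a
    level's feature set; with unit-normalized rows the result is in [-1, 1]."""
    f_word = f_word / np.linalg.norm(f_word)
    f_level = f_level / np.linalg.norm(f_level, axis=1, keepdims=True)
    return float(np.mean(f_level @ f_word))

# One 2-d candidate feature against a toy 2-keyword core feature set.
assert abs(similarity(np.array([1.0, 0.0]),
                      np.array([[1.0, 0.0], [0.0, 1.0]])) - 0.5) < 1e-9
```

The normalization step is an assumption introduced here to keep the score bounded; the patent only states the 0-to-1 range of the result.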
Step S204, traversing the word segmentation set to be processed, comparing the core similarity of the word segmentation with a preset core threshold value for any word segmentation, and judging whether the core similarity is larger than or equal to the preset core threshold value.
After the core, secondary-core and peripheral similarities of each word in the word set to be processed are computed, the set is traversed. For any word, its core similarity sim_A is first compared with the preset core threshold. If sim_A is greater than or equal to the preset core threshold, step S207 is executed to add the word to the preset value table; if sim_A = 1, the word is already in the preset value table and need not be added. If sim_A is less than the preset core threshold, step S205 is performed.
Step S205, comparing the sub-core similarity of the word segmentation with a preset sub-core threshold value, and judging whether the sub-core similarity is larger than or equal to the preset sub-core threshold value.
If the core similarity sim_A is less than the preset core threshold, the word's secondary-core similarity sim_B is then compared with the preset secondary-core threshold. If sim_B is greater than or equal to the preset secondary-core threshold, step S207 is executed to add the word to the preset value table; if sim_B = 1, the word is already in the preset value table and need not be added. If sim_B is less than the preset secondary-core threshold, step S206 is performed.
Step S206, comparing the peripheral similarity of the word segmentation with a preset peripheral threshold value, and judging whether the peripheral similarity is larger than or equal to the preset peripheral threshold value.
If the secondary-core similarity sim_B is less than the preset secondary-core threshold, the word's peripheral similarity sim_C is then compared with the preset peripheral threshold. If sim_C is greater than or equal to the preset peripheral threshold, step S207 is executed to add the word to the preset value table; if sim_C = 1, the word is already in the preset value table and need not be added. If sim_C is less than the preset peripheral threshold, the word does not meet the requirements of the preset value table, does not belong to the mainstream values, and is discarded. After discarding it, traversal continues with the next word in the set, judging its core, secondary-core and peripheral similarities, until all words in the word set to be processed have been traversed and the update of the preset value table is complete.
Step S207, adding the word to the preset value table.
When the core similarity is greater than or equal to the preset core threshold, or the secondary-core similarity is greater than or equal to the preset secondary-core threshold, or the peripheral similarity is greater than or equal to the preset peripheral threshold, the word can be added to the preset value table as a core-node, secondary-core-node or peripheral-node keyword respectively, according to which condition was met. After the word is added, traversal continues with the next word in the set, judging its core, secondary-core and peripheral similarities, until all words in the word set to be processed have been traversed and the update of the preset value table is complete.
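The threshold cascade of steps S204 to S207 can be sketched as follows; the threshold values are hypothetical, since the patent leaves them implementation-defined:

```python
# Hypothetical thresholds for core, secondary-core and peripheral levels.
T_CORE, T_SUB, T_PERI = 0.9, 0.8, 0.7

def decide(sim_a, sim_b, sim_c):
    """Cascade of steps S204-S207: try core, then secondary-core, then
    peripheral; a similarity of exactly 1 means the word is already in
    the table and need not be added; below all thresholds, discard."""
    for sim, thresh, level in ((sim_a, T_CORE, "core"),
                               (sim_b, T_SUB, "secondary"),
                               (sim_c, T_PERI, "peripheral")):
        if sim >= thresh:
            return "already present" if sim == 1 else "add as " + level
    return "discard"

assert decide(0.95, 0.0, 0.0) == "add as core"
assert decide(0.5, 0.5, 0.5) == "discard"
```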
Step S103, calculating to obtain the value data of the text according to the number and the weight of the matched node sets of different levels.
After the matched node sets are obtained, the value data of the text can be calculated from the number of keywords in each level's matched node set and the weight corresponding to each level. Specifically, a first product of the size of the core node set and the core-node weight, a second product of the size of the secondary-core node set and the secondary-core-node weight, a third product of the size of the peripheral node set and the peripheral-node weight, and a fourth product of the size of the keyword set and the core-node weight are computed; the first, second and third products are summed, and the ratio of this sum to the fourth product is taken:

v = (|A|·α′_A + |B|·α′_B + |C|·α′_C) / (|S|·α′_A)    (1)

where |A| is the size of the core node set, |B| the size of the secondary-core node set, |C| the size of the peripheral node set, |S| the size of the keyword set, α′_A the core-node weight, α′_B the secondary-core-node weight, α′_C the peripheral-node weight, and v the intermediate value data of the text.
Considering that the keyword set obtained by word segmentation may contain some non-value keywords, so that the non-value matched node set contains many keywords and the computed intermediate value v of the text is small, this embodiment corrects v using a preset exponent to obtain the value data of the text:

v′ = v^0.3    (2)

where v′ is the value data of the text and the preset exponent is 0.3; v is stretched with a power function to obtain the corrected value data v′. On this basis, if all matched node sets obtained for the keyword set are core node sets, the value data of the text is v′ = 1; if only the non-value matched node set is obtained, the value data of the text is v′ = 0.
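Formulas (1) and (2) can be sketched directly; the weight values passed in below are illustrative, not derived from a real value table:

```python
def text_value(n_core, n_sub, n_peri, n_total, w_core, w_sub, w_peri):
    """Intermediate value v per formula (1), corrected per formula (2)."""
    v = (n_core * w_core + n_sub * w_sub + n_peri * w_peri) / (n_total * w_core)
    return v ** 0.3  # power-function stretch with preset exponent 0.3

# If every keyword matches a core node, v = 1 and hence v' = 1.
assert text_value(10, 0, 0, 10, 0.5, 0.3, 0.2) == 1.0
```

Note how the exponent lifts intermediate scores: with half the keywords matching core nodes, v = 0.5 but v′ ≈ 0.81, compensating for non-value keywords diluting the keyword set.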
Further, each weight is calculated as follows. The core-node weight is obtained by normalizing the first sum over the node keywords in the core node set, where the first sum accumulates, for each node keyword, the product of a preset weight and the number of its related and similar nodes, plus its node frequency. The secondary-core-node weight is obtained analogously by normalizing the second sum over the secondary-core node set, and the peripheral-node weight by normalizing the third sum over the peripheral node set. See the following formulas:
α′_A = softmax(∑_{x∈A} [f_x + λd_x])   (3)

where α′_A in formula (3) is the core node weight; A is the core node set, over which x ranges; d_x denotes the number of related nodes and similar nodes of node keyword x in the core node set; f_x denotes the node frequency of node keyword x; λ is a preset weight; and softmax is a normalization function.
α′_B = softmax(∑_{x∈B} [f_x + λd_x])   (4)

where α′_B in formula (4) is the secondary core node weight; B is the secondary core node set, over which x ranges; d_x denotes the number of related nodes and similar nodes of node keyword x in the secondary core node set; f_x denotes the node frequency of node keyword x; λ is a preset weight; and softmax is a normalization function.
α′_C = softmax(∑_{x∈C} [f_x + λd_x])   (5)

where α′_C in formula (5) is the peripheral node weight; C is the peripheral node set, over which x ranges; d_x denotes the number of related nodes and similar nodes of node keyword x in the peripheral node set; f_x denotes the node frequency of node keyword x; λ is a preset weight used to balance the scale difference between the node counts and the node frequency, set according to the implementation conditions; and softmax is a normalization function.
Each weight is determined by the attribute information of the keywords in the matching node sets of each level in the preset value table, such as the number of related nodes and similar nodes and the node frequency: the higher the node frequency in the preset value table, the larger the value data, that is, the larger the weight; and the larger the number of related nodes and similar nodes, the more the keyword acts as an important hub in the preset value table, and the larger its weight.
According to the text value calculation method based on the value table provided by this embodiment, the text is segmented, the matching node sets of different levels contained in the text are determined by matching the keywords of the text against the node keywords in the preset value table, and the value data of the text is then calculated from the number and the weight of each level's matching node set, thereby determining the value of the text on the basis of the preset value table.
Fig. 3 shows a schematic structural diagram of a text value calculating device based on a value table according to an embodiment of the present invention. As shown in fig. 3, the apparatus includes:
the word segmentation module 310 is adapted to perform word segmentation on the text to obtain a keyword set containing a plurality of keywords;
The matching module 320 is adapted to traverse the keyword set based on a preset value table, query the node keywords matched with the keywords, and obtain matching node sets of different levels; the preset value table comprises a plurality of preset level nodes; each node includes a node keyword;
the value calculation module 330 is adapted to calculate the value data of the text according to the number and the weight of the matched node sets with different levels.
Optionally, the preset plurality of level nodes includes: core nodes, secondary core nodes and peripheral nodes; each node further comprises: node number, node frequency, related nodes, and similar nodes.
Optionally, the matching module 320 is further adapted to:
traversing the keyword set, and inquiring a preset value table aiming at any keyword to obtain node keywords matched with the keywords;
classifying the node keywords according to the levels of the nodes to which the node keywords belong to obtain matching node sets of different levels; the matching node set comprises a core node set, a secondary core node set and a peripheral node set.
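The traversal performed by the matching module can be sketched as follows, assuming the preset value table is represented as a mapping from node keyword to level (this data structure and the level labels are assumptions for illustration):

```python
def match_keywords(keywords, value_table):
    """Split keywords into per-level matching node sets.

    `value_table` maps node keyword -> level, one of
    'core', 'secondary', 'peripheral' (hypothetical encoding).
    Keywords absent from the table go to the non-value set.
    """
    sets = {"core": set(), "secondary": set(), "peripheral": set(), "non_value": set()}
    for kw in keywords:
        level = value_table.get(kw, "non_value")
        sets[level].add(kw)
    return sets

# Toy value table and keyword set.
table = {"reform": "core", "policy": "secondary", "meeting": "peripheral"}
matched = match_keywords(["reform", "meeting", "weather"], table)
```

A keyword such as "weather", which matches no node keyword, lands in the non-value matching node set, mirroring the non-matching module described below.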
Optionally, the value calculation module 330 is further adapted to:
calculating a first product of the number of the core node set and the core node weight, a second product of the number of the secondary core node set and the secondary core node weight, a third product of the number of the peripheral node set and the peripheral node weight, and a fourth product of the number of the keyword set and the core node weight; the core node weight is obtained by normalizing the first sum value of the node keywords in the core node set; the first sum value is obtained by accumulating, over each node keyword in the core node set, the sum of the node frequency of that keyword and the product of a preset weight and the number of its related nodes and similar nodes; the secondary core node weight is obtained by normalizing the second sum value of the node keywords in the secondary core node set; the second sum value is obtained by accumulating, over each node keyword in the secondary core node set, the sum of the node frequency of that keyword and the product of the preset weight and the number of its related nodes and similar nodes; the peripheral node weight is obtained by normalizing the third sum value of the node keywords in the peripheral node set; the third sum value is obtained by accumulating, over each node keyword in the peripheral node set, the sum of the node frequency of that keyword and the product of the preset weight and the number of its related nodes and similar nodes;
And accumulating the first product, the second product and the third product, calculating the ratio of the accumulated result to the fourth product, and correcting the ratio according to a preset index to obtain the value data of the text.
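The ratio-and-correction step of the value calculation module might look like the sketch below. It follows the text: numerator (n_A·α_A + n_B·α_B + n_C·α_C), denominator n_total·α_A, then the preset exponent; the function signature and the zero-keyword guard are illustrative assumptions:

```python
def text_value(n_core, n_sub, n_peri, n_keywords,
               w_core, w_sub, w_peri, exponent=0.3):
    """Value data of a text per the description (a sketch, not the exact claim).

    Numerator: count of each matching node set times its level weight.
    Denominator: total keyword count times the core node weight, so a
    text whose every keyword is a core match scores v = 1 before (and
    after) the power-function correction.
    """
    if n_keywords == 0:
        return 0.0  # no keywords: treat as valueless (assumption)
    v = (n_core * w_core + n_sub * w_sub + n_peri * w_peri) / (n_keywords * w_core)
    return v ** exponent
```

With all five keywords matching core nodes the ratio is exactly 1; with no matches at all it is 0, reproducing the two boundary cases stated for formula (2).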
Optionally, the apparatus further comprises: the non-matching module 340 is adapted to classify the keywords into a non-value matching node set if the preset value table is queried and the node keywords matching the keywords are not obtained.
Optionally, the apparatus further comprises: the non-matching value module 350 is adapted to determine that the value data of the text is 0 if the set of matching nodes is a set of non-value matching nodes.
Optionally, the word segmentation module 310 is further adapted to:
preprocessing the text; the preprocessing comprises format filtering processing and stop word filtering processing;
processing the text according to punctuation marks, and splitting the text into a plurality of sentences;
word segmentation processing is carried out on each sentence, and each phrase contained in each sentence is obtained;
and combining the phrases based on a preset expansion word list to obtain corresponding keywords to form a keyword set.
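The four steps of the word segmentation module can be sketched as below. The whitespace tokenizer, stop-word list, and expansion word list are placeholders; a real implementation would use a Chinese segmenter (e.g. jieba) and the patent's preset expansion word list:

```python
import re

STOP_WORDS = {"the", "a", "of"}                      # illustrative stop-word list
EXPANSION = {("value", "table"): "value table"}      # hypothetical expansion word list

def extract_keywords(text):
    """Preprocess, sentence-split, segment, and merge phrases into keywords."""
    # 1) preprocessing: format filtering (collapse whitespace) plus lowercasing
    text = re.sub(r"\s+", " ", text).strip().lower()
    # 2) split the text into sentences on punctuation marks
    sentences = [s for s in re.split(r"[.!?;]", text) if s.strip()]
    keywords = []
    for sent in sentences:
        # 3) word segmentation with stop-word filtering (whitespace stand-in)
        words = [w for w in sent.split() if w not in STOP_WORDS]
        # 4) merge adjacent phrases found in the preset expansion word list
        i = 0
        while i < len(words):
            if i + 1 < len(words) and (words[i], words[i + 1]) in EXPANSION:
                keywords.append(EXPANSION[(words[i], words[i + 1])])
                i += 2
            else:
                keywords.append(words[i])
                i += 1
    return keywords

kws = extract_keywords("The value  table works. A test!")
```

The expansion step is what turns two adjacent phrases into one keyword, so multi-word node keywords in the value table can still be matched.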
Optionally, the apparatus further comprises: the updating module 360, adapted to split a first text into a plurality of sentences, perform first word segmentation on the sentences, and acquire the part-of-speech information, grammar dependency relationship and semantic dependency relationship information of each first word segment; extract word segments to be processed according to that information, and filter them to obtain a word segment set to be processed, the filtering comprising stop-word filtering, digit filtering, low-frequency person-name filtering, digital-word filtering, part-of-speech filtering and keyword filtering; extract, based on a preset model, the word segmentation feature set of the word segment set to be processed, the core feature set of the core node keywords of the preset value table, the secondary core feature set of the secondary core node keywords, and the peripheral feature set of the peripheral node keywords; calculate the core similarity of each word segment from the word segmentation feature set, the core feature set and the number of core node keywords, the secondary core similarity from the word segmentation feature set, the secondary core feature set and the number of secondary core node keywords, and the peripheral similarity from the word segmentation feature set, the peripheral feature set and the number of peripheral node keywords; and traverse the word segment set to be processed: for any word segment, compare its core similarity with a preset core threshold, and add the word segment into the preset value table if the core similarity is greater than or equal to the preset core threshold; if the core similarity is smaller than the preset core threshold, compare its secondary core similarity with a preset secondary core threshold, and add the word segment into the preset value table if the secondary core similarity is greater than or equal to the preset secondary core threshold; if the secondary core similarity is smaller than the preset secondary core threshold, compare its peripheral similarity with a preset peripheral threshold, and add the word segment into the preset value table if the peripheral similarity is greater than or equal to the preset peripheral threshold.
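The cascaded threshold comparison of the updating module reduces to a simple if/elif chain; the threshold values and similarity inputs below are illustrative, not taken from the patent:

```python
def classify_new_word(core_sim, sub_sim, peri_sim,
                      core_th=0.8, sub_th=0.7, peri_th=0.6):
    """Decide which level, if any, a candidate word joins in the value table.

    Matches the cascade in the text: core is tried first, then secondary
    core, then peripheral; a word failing all three thresholds is dropped.
    """
    if core_sim >= core_th:
        return "core"
    if sub_sim >= sub_th:
        return "secondary"
    if peri_sim >= peri_th:
        return "peripheral"
    return None
```

Note the ordering matters: a word with both a high core similarity and a high peripheral similarity is added as a core node, since the core comparison is performed first.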
The above descriptions of the modules refer to the corresponding descriptions in the method embodiments, and are not repeated herein.
The embodiment of the invention further provides a non-volatile computer storage medium storing at least one executable instruction that can perform the value table-based text value calculation method in any of the above method embodiments.
FIG. 4 illustrates a schematic diagram of a computing device according to an embodiment of the invention; the specific embodiment does not limit the specific implementation of the computing device.
As shown in fig. 4, the computing device may include: a processor 402, a communication interface (Communications Interface) 404, a memory 406, and a communication bus 408.
Wherein:
processor 402, communication interface 404, and memory 406 communicate with each other via communication bus 408.
A communication interface 404 for communicating with network elements of other devices, such as clients or other servers.
Processor 402 is configured to execute program 410, and may specifically perform relevant steps in the embodiments of the value-table-based text value calculation method described above.
In particular, program 410 may include program code including computer-operating instructions.
The processor 402 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention. The one or more processors included in the computing device may be processors of the same type, such as one or more CPUs, or processors of different types, such as one or more CPUs and one or more ASICs.
Memory 406 for storing programs 410. Memory 406 may comprise high-speed RAM memory or may also include non-volatile memory (non-volatile memory), such as at least one disk memory.
Program 410 may be specifically operable to cause processor 402 to perform a value table-based text value calculation method in any of the method embodiments described above. The specific implementation of each step in the procedure 410 may refer to the corresponding descriptions in the corresponding steps and units in the text value calculation embodiment based on the value table, which are not described herein. It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the apparatus and modules described above may refer to corresponding procedure descriptions in the foregoing method embodiments, which are not repeated herein.
The algorithms or displays presented herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with the teachings herein. The required structure for a construction of such a system is apparent from the description above. In addition, embodiments of the present invention are not directed to any particular programming language. It should be appreciated that the teachings of embodiments of the present invention described herein may be implemented in a variety of programming languages, and the above description of specific languages is provided for disclosure of preferred embodiments of the present invention.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the above description of exemplary embodiments of the invention, various features of the embodiments of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be construed as reflecting the intention that: i.e., an embodiment of the invention that is claimed, requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules in the apparatus of the embodiments may be adaptively changed and disposed in one or more apparatuses different from the embodiments. The modules or units or components of the embodiments may be combined into one module or unit or component and, furthermore, they may be divided into a plurality of sub-modules or sub-units or sub-components. Any combination of all features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or units of any method or apparatus so disclosed, may be used in combination, except insofar as at least some of such features and/or processes or units are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings), may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments herein include some features but not others included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.
Various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that some or all of the functionality of some or all of the components according to embodiments of the present invention may be implemented in practice using a microprocessor or Digital Signal Processor (DSP). Embodiments of the present invention may also be implemented as a device or apparatus program (e.g., a computer program and a computer program product) for performing a portion or all of the methods described herein. Such a program embodying the embodiments of the present invention may be stored on a computer readable medium, or may have the form of one or more signals. Such signals may be downloaded from an internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. Embodiments of the invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. do not denote any order. These words may be interpreted as names. The steps in the above embodiments should not be construed as limiting the order of execution unless specifically stated.

Claims (7)

1. A text value calculation method based on a value table is characterized by comprising the following steps:
word segmentation is carried out on the text to obtain a keyword set containing a plurality of keywords;
traversing the keyword set based on a preset value table, and inquiring the preset value table aiming at any keyword to obtain a node keyword matched with the keyword; classifying the node keywords according to the levels of the nodes to which the node keywords belong to obtain matching node sets of different levels; wherein the preset value table comprises a plurality of preset level nodes; each node includes a node keyword; the preset plurality of level nodes comprise: core nodes, secondary core nodes and peripheral nodes; each node further comprises: node number, node frequency, related nodes and similar nodes; the matching node set comprises a core node set, a secondary core node set and a peripheral node set; the preset value table is obtained by: splitting a first text into a plurality of sentences, performing first word segmentation on the sentences, and acquiring part-of-speech information, grammar dependency relationship and semantic dependency relationship information of each first word segment; the first text is any new text; extracting word segments to be processed according to the part-of-speech information, the grammar dependency relationship and the semantic dependency relationship information of each first word segment, and filtering the word segments to be processed to obtain a word segment set to be processed; the filtering processing comprises stop word filtering, digit filtering, low-frequency person-name filtering, digital word filtering, part-of-speech filtering and keyword filtering; extracting, based on a preset model, a word segmentation feature set of the word segment set to be processed, a core feature set of the core node keywords of the preset value table, a secondary core feature set of the secondary core node keywords and a peripheral feature set of the peripheral node keywords; calculating, according to the word segmentation feature set, the core feature set and the number of the core node keywords, the core similarity of each word segment in the word segment set to be processed, calculating, according to the word segmentation feature set, the secondary core feature set and the number of the secondary core node keywords, the secondary core similarity of each word segment in the word segment set to be processed, and calculating, according to the word segmentation feature set, the peripheral feature set and the number of the peripheral node keywords, the peripheral similarity of each word segment in the word segment set to be processed; traversing the word segment set to be processed, comparing, for any word segment, the core similarity of the word segment with a preset core threshold, and adding the word segment into the preset value table if the core similarity is greater than or equal to the preset core threshold; if the core similarity is smaller than the preset core threshold, comparing the secondary core similarity of the word segment with a preset secondary core threshold, and adding the word segment into the preset value table if the secondary core similarity is greater than or equal to the preset secondary core threshold; if the secondary core similarity is smaller than the preset secondary core threshold, comparing the peripheral similarity of the word segment with a preset peripheral threshold, and adding the word segment into the preset value table if the peripheral similarity is greater than or equal to the preset peripheral threshold;
According to the number and the weight of the matched node sets of different levels, calculating to obtain a first product of the number of the core node sets and the weight of the core node, a second product of the number of the secondary core node sets and the weight of the secondary core node, a third product of the number of the peripheral node sets and the weight of the peripheral node, and a fourth product of the number of the keyword sets and the weight of the core node; the core node weight is obtained by carrying out normalization processing on the first sum value of each node keyword in the core node set; the first sum value is obtained by accumulating the sum of the product of the number of related nodes and similar nodes of each node keyword in the core node set and preset weights and the node frequency of the node keywords; the secondary core node weight is obtained by carrying out normalization processing on the second sum value of each node keyword in the secondary core node set; the second sum value is obtained according to the sum of the product of the number of related nodes and similar nodes of each node keyword in the secondary core node set and preset weights and the node frequency of the node keywords; the peripheral node weight is obtained by carrying out normalization processing on the third sum value of each node keyword in the peripheral node set; the third sum value is obtained according to the sum of the product of the number of related nodes and similar nodes of each node keyword in the peripheral node set and the preset weight and the node frequency of the node keywords; and accumulating the first product, the second product and the third product, calculating the ratio of the accumulated result to the fourth product, and correcting the ratio according to a preset index to obtain the value data of the text.
2. The method according to claim 1, wherein the method further comprises:
and if the preset value list is queried, node keywords matched with the keywords are not obtained, and the keywords are classified into a non-value matching node set.
3. The method according to claim 2, wherein the method further comprises:
and if the matching node set is a non-value matching node set, determining that the value data of the text is 0.
4. The method of claim 1, wherein the word segmentation of the text to obtain a keyword set comprising a plurality of keywords further comprises:
preprocessing the text; the preprocessing comprises format filtering processing and stop word filtering processing;
processing the text according to punctuation marks, and splitting the text into a plurality of sentences;
word segmentation processing is carried out on each sentence, and each phrase contained in each sentence is obtained;
and combining the phrases based on a preset expansion word list to obtain corresponding keywords to form a keyword set.
5. A value meter-based text value computing device, the device comprising:
The word segmentation module is suitable for carrying out word segmentation processing on the text to obtain a keyword set containing a plurality of keywords;
the matching module is suitable for traversing the keyword set based on a preset value table, and inquiring the preset value table aiming at any keyword to obtain a node keyword matched with the keyword; classifying the node keywords according to the levels of the nodes to which the node keywords belong to obtain matching node sets of different levels; wherein the preset value table comprises a plurality of preset level nodes; each node includes a node keyword; the preset plurality of level nodes comprise: core nodes, secondary core nodes and peripheral nodes; each node further comprises: node number, node frequency, related nodes and similar nodes; the matching node set comprises a core node set, a secondary core node set and a peripheral node set; the preset value table is obtained by: splitting a first text into a plurality of sentences, performing first word segmentation on the sentences, and acquiring part-of-speech information, grammar dependency relationship and semantic dependency relationship information of each first word segment; the first text is any new text; extracting word segments to be processed according to the part-of-speech information, the grammar dependency relationship and the semantic dependency relationship information of each first word segment, and filtering the word segments to be processed to obtain a word segment set to be processed; the filtering processing comprises stop word filtering, digit filtering, low-frequency person-name filtering, digital word filtering, part-of-speech filtering and keyword filtering; extracting, based on a preset model, a word segmentation feature set of the word segment set to be processed, a core feature set of the core node keywords of the preset value table, a secondary core feature set of the secondary core node keywords and a peripheral feature set of the peripheral node keywords; calculating, according to the word segmentation feature set, the core feature set and the number of the core node keywords, the core similarity of each word segment in the word segment set to be processed, calculating, according to the word segmentation feature set, the secondary core feature set and the number of the secondary core node keywords, the secondary core similarity of each word segment in the word segment set to be processed, and calculating, according to the word segmentation feature set, the peripheral feature set and the number of the peripheral node keywords, the peripheral similarity of each word segment in the word segment set to be processed; traversing the word segment set to be processed, comparing, for any word segment, the core similarity of the word segment with a preset core threshold, and adding the word segment into the preset value table if the core similarity is greater than or equal to the preset core threshold; if the core similarity is smaller than the preset core threshold, comparing the secondary core similarity of the word segment with a preset secondary core threshold, and adding the word segment into the preset value table if the secondary core similarity is greater than or equal to the preset secondary core threshold; if the secondary core similarity is smaller than the preset secondary core threshold, comparing the peripheral similarity of the word segment with a preset peripheral threshold, and adding the word segment into the preset value table if the peripheral similarity is greater than or equal to the preset peripheral threshold;
The value calculation module is suitable for calculating to obtain a first product of the number of the core node sets and the weight of the core node, a second product of the number of the secondary core node sets and the weight of the secondary core node, a third product of the number of the peripheral node sets and the weight of the peripheral node, and a fourth product of the number of the keyword sets and the weight of the core node according to the number and the weight of the matching node sets of different levels; the core node weight is obtained by carrying out normalization processing on the first sum value of each node keyword in the core node set; the first sum value is obtained by accumulating the sum of the product of the number of related nodes and similar nodes of each node keyword in the core node set and preset weights and the node frequency of the node keywords; the secondary core node weight is obtained by carrying out normalization processing on the second sum value of each node keyword in the secondary core node set; the second sum value is obtained according to the sum of the product of the number of related nodes and similar nodes of each node keyword in the secondary core node set and preset weights and the node frequency of the node keywords; the peripheral node weight is obtained by carrying out normalization processing on the third sum value of each node keyword in the peripheral node set; the third sum value is obtained according to the sum of the product of the number of related nodes and similar nodes of each node keyword in the peripheral node set and the preset weight and the node frequency of the node keywords; and accumulating the first product, the second product and the third product, calculating the ratio of the accumulated result to the fourth product, and correcting the ratio according to a preset index to obtain the value data of the text.
6. A computing device, comprising: the device comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface complete communication with each other through the communication bus;
the memory is configured to store at least one executable instruction that causes the processor to perform operations corresponding to the value table-based text value calculation method according to any one of claims 1 to 4.
7. A computer storage medium having stored therein at least one executable instruction for causing a processor to perform operations corresponding to the value table-based text value calculation method of any one of claims 1 to 4.
CN202310596067.5A 2023-05-24 2023-05-24 Text value calculation method and device based on value scale Active CN116681056B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310596067.5A CN116681056B (en) 2023-05-24 2023-05-24 Text value calculation method and device based on value scale


Publications (2)

Publication Number Publication Date
CN116681056A CN116681056A (en) 2023-09-01
CN116681056B true CN116681056B (en) 2024-01-26

Family

ID=87786455

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310596067.5A Active CN116681056B (en) 2023-05-24 2023-05-24 Text value calculation method and device based on value scale

Country Status (1)

Country Link
CN (1) CN116681056B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117669550B (en) * 2023-11-13 2024-04-30 东风日产数据服务有限公司 Topic mining method, system, equipment and medium based on text center

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007183796A (en) * 2006-01-06 2007-07-19 Pma:Kk Business evaluation value calculation system
CN108319587A (en) * 2018-02-05 2018-07-24 中译语通科技股份有限公司 A kind of public sentiment value calculation method and system of more weights, computer
CN109062905A (en) * 2018-09-04 2018-12-21 武汉斗鱼网络科技有限公司 A kind of barrage value of edition evaluation method, device, equipment and medium
CN109753562A (en) * 2019-02-11 2019-05-14 杭州乾博科技有限公司 A kind of instant messaging news value appraisal procedure and system
CN109885681A (en) * 2019-01-25 2019-06-14 中译语通科技股份有限公司 A kind of patent value degree calculation method based on computer technology bibliographic data base
KR20190104745A (en) * 2018-03-02 2019-09-11 국민대학교산학협력단 Issue interest based news value evaluation apparatus and method, storage media storing the same
CN110347800A (en) * 2019-07-15 2019-10-18 中国工商银行股份有限公司 Text handling method and device and electronic equipment and readable storage medium storing program for executing
CN110866389A (en) * 2018-08-17 2020-03-06 北大方正集团有限公司 Information value evaluation method, device, equipment and computer readable storage medium
CN111930962A (en) * 2020-09-02 2020-11-13 平安国际智慧城市科技股份有限公司 Document data value evaluation method and device, electronic equipment and storage medium
KR20200137924A (en) * 2019-05-29 2020-12-09 경희대학교 산학협력단 Real-time keyword extraction method and device in text streaming environment
CN112417088A (en) * 2019-08-19 2021-02-26 武汉渔见晚科技有限责任公司 Evaluation method and device for text value in community

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050071174A1 (en) * 2001-07-31 2005-03-31 Leibowitz Mark Harold Method and system for valuing intellectual property

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on intelligence value evaluation methods for short texts; Zhang Ning; Ship Electronic Engineering (舰船电子工程) (Issue 01); full text *

Also Published As

Publication number Publication date
CN116681056A (en) 2023-09-01

Similar Documents

Publication Publication Date Title
CN108197117B (en) Chinese text keyword extraction method based on document theme structure and semantics
JP6335898B2 (en) Information classification based on product recognition
CN112069298A (en) Human-computer interaction method, device and medium based on semantic web and intention recognition
CN110413787B (en) Text clustering method, device, terminal and storage medium
CN107357777B (en) Method and device for extracting label information
CN110232112A (en) Keyword extracting method and device in article
CN116681056B (en) Text value calculation method and device based on value scale
CN109446393B (en) Network community topic classification method and device
CN109271524A (en) Entity link method in knowledge base question answering system
CN111428031A (en) Graph model filtering method fusing shallow semantic information
CN111475608A (en) Mashup service characteristic representation method based on functional semantic correlation calculation
CN109753646B (en) Article attribute identification method and electronic equipment
CN113127607A (en) Text data labeling method and device, electronic equipment and readable storage medium
CN116127079B (en) Text classification method
CN109344397B (en) Text feature word extraction method and device, storage medium and program product
CN116561320A (en) Method, device, equipment and medium for classifying automobile comments
CN114943285B (en) Intelligent auditing system for internet news content data
CN114969324A (en) Chinese news title classification method based on subject word feature expansion
CN114595684A (en) Abstract generation method and device, electronic equipment and storage medium
CN114239539A (en) English composition off-topic detection method and device
CN114444491A (en) New word recognition method and device
CN113254586A (en) Unsupervised text retrieval method based on deep learning
CN112667779A (en) Information query method and device, electronic equipment and storage medium
CN113378562B (en) Word segmentation processing method, device, computing equipment and storage medium
CN110909533B (en) Resource theme judging method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant