CN111488725A - Machine-intelligence-assisted grounded theory coding optimization method - Google Patents

Machine-intelligence-assisted grounded theory coding optimization method

Info

Publication number
CN111488725A
Authority
CN
China
Prior art keywords
coding
corpus
data
concept
new
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010178957.0A
Other languages
Chinese (zh)
Other versions
CN111488725B (en
Inventor
卢暾
蒋特
顾宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fudan University
Original Assignee
Fudan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fudan University filed Critical Fudan University
Priority to CN202010178957.0A priority Critical patent/CN111488725B/en
Publication of CN111488725A publication Critical patent/CN111488725A/en
Application granted granted Critical
Publication of CN111488725B publication Critical patent/CN111488725B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/10: Text processing
    • G06F40/12: Use of codes for handling textual entities
    • G06F40/126: Character encoding
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities
    • G06F40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the technical field of qualitative research, and particularly relates to a machine-intelligence-assisted grounded theory coding optimization method. The core of the optimization method lies in two links: feature extraction and automatic coding classification. Feature extraction builds on the observation that texts under the same coding classification show high consistency in their information features; the features of the texts under each classification are extracted and serve as the classification basis for the subsequent automatic coding link. Automatic coding computes, from the extracted classification features, the similarity between a new text and the corpus of each classification, and assigns the new text to the classification with the highest similarity. Throughout the encoding process, manual adjustment and feature re-extraction are combined to obtain a more accurate coding result. The invention integrates machine intelligence into the classic grounded theory coding process to optimize the process and improve researchers' efficiency in data processing and coding.

Description

Machine-intelligence-assisted grounded theory coding optimization method
Technical Field
The invention belongs to the technical field of qualitative research, and particularly relates to a grounded theory coding optimization method.
Background
In qualitative research, grounded theory is a widely adopted method. Grounded theory is a methodology proposed by Glaser and Strauss in 1967 for building theory from data. Researchers can draw on materials such as biographies, diaries, recordings, manuscripts and reports, or supplement them with interviews and field observation records, and on that basis analyze in depth the essence of a certain phenomenon or problem.
Collecting information through supplementary interviews is a common way for researchers to study social phenomena at present. The method emphasizes starting from actual observation rather than from theoretical hypotheses: researchers recruit respondents who match the characteristics of the phenomenon under study and have relevant experience, obtain first-hand information in conversation with them, probe the deep causes behind the phenomenon through in-depth interviews, summarize empirical patterns, and then develop them into theory.
In the interview approach, the collection of raw material necessarily involves interviews with respondents, which in turn generate large amounts of interview data. Researchers need to organize these data into a coding framework. This organizing work usually consumes a great deal of the researchers' effort; yet the actual coding process contains a certain amount of repetitive work that follows identifiable rules, so part of it can be taken over by a machine.
Disclosure of Invention
In order to better assist qualitative researchers in organizing and analyzing interview data, the invention designs a machine-assisted grounded theory coding optimization method.
Typically, the collection of raw material involves interviews with respondents, which generate large amounts of interview data that researchers must organize into a coding framework. This work usually consumes a great deal of effort; since the actual coding proceeds in logical steps that follow identifiable rules, a machine can take over part of the organizing and classifying work. The invention therefore provides an optimization method for the encoding process, shown in FIG. 1. The specific steps of the method are described below.
(1) Data pre-processing
After obtaining the interview recordings, researchers can transcribe them with transcription software or a platform and obtain the corresponding text material through manual review.
Subsequently, the interview transcript is cut into sentence blocks by a sentence segmentation tool, and the segmentation result is adjusted as needed through manual review to obtain the corpus that serves as the raw material for coding.
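As an illustration of the sentence segmentation step just described, the sketch below cuts a transcribed Chinese interview into sentence blocks on terminal punctuation. The function name and the punctuation set are illustrative assumptions; the patent leaves the segmentation tool unspecified.

```python
import re

def split_sentences(text):
    # Split after Chinese/Western sentence-final punctuation,
    # keeping each delimiter attached to the sentence it ends.
    parts = re.split(r'(?<=[。！？!?；;])', text)
    return [p.strip() for p in parts if p.strip()]

blocks = split_sentences("研究者首先收集数据。然后进行编码！最后形成理论？")
print(blocks)
```

The resulting sentence blocks would then go through the manual review step before entering the corpus.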
(2) Artificial precoding
The corpus set obtained in step (1) is manually pre-coded to form a preliminary coding scheme. In the pre-coding algorithm, the selected raw material is coded at the concept level and the topic level through cyclic coding with random data selection, and the coding framework is adjusted continuously until preliminary information saturation is reached or the current data set is fully coded. The algorithm also supports continued coding of new data on top of an existing coding result, which gives it high flexibility: when information saturation has not been reached, or when the user considers the encoding incomplete, new data can continue to be coded. The precoding algorithm is given in Appendix 1; its flow is as follows.
Each encoding run can continue on the result set of the previous run or start from an empty coding result. In each pass, an uncoded data item PD is randomly selected from the new data set (algorithm lines 4-11). A concept CN corresponding to the item is produced by manual coding (line 12); then the topic set TS of the current encoding result CT is searched topic by topic for a matching concept (lines 13-22); if the concept already exists, the item is added to the corresponding topic and concept set (lines 23-29).
The cyclic coding continues until information saturation is reached or the current new data set is fully coded (line 3). Information saturation is judged by comparing the ISV_cnt value with the ISV value. The user-defined ISV value means that after ISV further data items have been coded, the total number of concepts in the coding result is still unchanged, at which point information saturation is considered reached; ISV_cnt records how many consecutive coding steps have left the total concept count unchanged. With a reasonable ISV value, the accuracy of the saturation judgment is essentially guaranteed. If new data still needs to be coded, Algorithm 1 is simply run again with the previous coding result and the new data to supplement the encoding.
Whether the data set has been fully coded is judged by comparing the size of the selected data set with that of the full data set: if they are equal, the data set has been fully coded; if the selected set is smaller, coding is not yet finished.
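The precoding loop above can be sketched as follows. This is a minimal reconstruction under stated assumptions, not the patent's Appendix 1 pseudocode itself: the names precode and assign_concept and the dictionary layout of the coding result are illustrative, and the human coder is stood in for by a callback.

```python
import random

def precode(new_data, code_tree, assign_concept, isv=5):
    """Cyclic precoding: randomly select uncoded items, assign each a topic
    and concept (assign_concept stands in for the human coder), and stop at
    information saturation or when the data set is fully coded.
    code_tree maps topic -> {concept: [items]}."""
    uncoded = list(new_data)
    isv_cnt = 0  # consecutive steps with unchanged total concept count
    while uncoded and isv_cnt < isv:
        item = uncoded.pop(random.randrange(len(uncoded)))
        concepts_before = sum(len(c) for c in code_tree.values())
        topic, concept = assign_concept(item)  # manual coding step
        code_tree.setdefault(topic, {}).setdefault(concept, []).append(item)
        concepts_after = sum(len(c) for c in code_tree.values())
        # Saturation counter resets whenever a new concept appears.
        isv_cnt = isv_cnt + 1 if concepts_after == concepts_before else 0
    return code_tree

# Supplementing new data is simply another call with the previous result.
tree = precode(["句1", "句2", "句3"], {}, lambda s: ("主题A", "概念1"), isv=10)
print(tree)
```

Calling precode again with the returned code_tree and a fresh data set mirrors the patent's "continue coding new data on the previous result" behavior.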
(3) Coding feature extraction
The coding features can be extracted with any feature extraction method suited to the corpus, including but not limited to TF-IDF, LDA and neural network methods.
TF-IDF is a common statistical method for evaluating how important a word is to a particular document within a corpus set. A large TF-IDF value generally indicates that the word characterizes the document well.
TF denotes Term Frequency. As shown in formula 1, it is calculated by dividing the number of occurrences word_cnt of the word in the material by the total word count total_cnt of the material. (For continuous text such as Chinese, the sentences can first be cut into words with a word segmentation tool.)

TF = word_cnt / total_cnt (formula 1)

IDF denotes Inverse Document Frequency, which measures the general importance of a word. As shown in formula 2, the total number of documents total_file is divided by the number of documents file_cnt containing the word, and the base-10 logarithm of the quotient gives the IDF value.

IDF = log10(total_file / file_cnt) (formula 2)

Finally, TF is multiplied by IDF to obtain TF-IDF, as shown in formula 3:

TF-IDF = TF × IDF (formula 3)
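Formulas 1-3 can be sketched directly, using the base-10 logarithm the text specifies. The function names and the toy documents are illustrative.

```python
import math

def tf(word, doc):
    # Formula 1: word_cnt / total_cnt within one material (a list of words).
    return doc.count(word) / len(doc)

def idf(word, docs):
    # Formula 2: log10(total_file / file_cnt) over the corpus set.
    file_cnt = sum(1 for d in docs if word in d)
    return math.log10(len(docs) / file_cnt)

def tf_idf(word, doc, docs):
    # Formula 3: TF × IDF.
    return tf(word, doc) * idf(word, docs)

docs = [["coding", "theory", "data"], ["data", "analysis"], ["coding", "data", "coding"]]
print(round(tf_idf("coding", docs[2], docs), 4))  # (2/3) × log10(3/2)
```

A word that appears in every document gets IDF 0, so it contributes nothing as a feature, which is the filtering effect the method relies on.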
For the present application scenario, feature extraction is performed at two levels: topic-level corpus feature extraction and concept-level corpus feature extraction.

In topic-level extraction, all corpora under a topic are taken as one material, and the corpora under all topics form the corpus set; the num_topic words with the highest TF-IDF values are computed as the features of that material. In concept-level extraction, all corpora under a concept are taken as one material, and all corpora under the topic to which the concept belongs form the corpus set; the num_concept words with the highest TF-IDF values are extracted as the features of that material.
On the basis of the automatic extraction, the feature words of the corresponding topics and concepts can be further adjusted and supplemented manually, which makes the classification more accurate.
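The two-level extraction can be sketched as below for the topic level; the concept level is analogous, with a concept's corpora as the material and its parent topic's corpora as the corpus set. The helper name top_features and the dictionary layout are assumptions for illustration.

```python
import math
from collections import Counter

def top_features(topic_words, num_topic=3):
    """Topic-level extraction: each topic's pooled corpora form one material,
    and all topics together form the corpus set; keep the num_topic words
    with the highest TF-IDF values as that topic's features."""
    n = len(topic_words)
    features = {}
    for topic, words in topic_words.items():
        scores = {}
        for w, cnt in Counter(words).items():
            file_cnt = sum(1 for ws in topic_words.values() if w in ws)
            scores[w] = (cnt / len(words)) * math.log10(n / file_cnt)
        ranked = sorted(scores, key=scores.get, reverse=True)
        features[topic] = ranked[:num_topic]
    return features

print(top_features({"主题A": ["编码", "数据", "数据"], "主题B": ["访谈", "数据"]}, num_topic=1))
```

Words shared by every topic score 0, so topic-specific vocabulary rises to the top; the selected feature words can then be adjusted manually as described above.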
(4) Automatic coding
On the basis of the feature extraction in step (3), new corpora can be coded and classified, supplementing the corpora in the coding framework. Here the features extracted in step (3) are used, and the new corpus is automatically encoded and classified by the TF-IDF method.
For a Chinese corpus, the TF-IDF feature extraction method requires word segmentation of the text material. After removing common words, the remaining words are taken as the feature words of the corpus. The degree to which the text matches each concept and topic classification is then calculated from these words, and the text is assigned to the topic and coding classification with the highest matching degree. The similarity between a new corpus t and a corpus s is calculated as shown in formula 4.
sim(t, s) = ( Σ_{i=1..m} Σ_{j=1..n} score(t_i, s_j) ) / (m × n) (formula 4)

where m and n are the numbers of feature words of the new corpus t and of the corpus s respectively, and score(t_i, s_j) is the similarity score between the i-th word of corpus t and the j-th word of corpus s, calculated as shown in formula 5:

score(x, y) = 1, if dis(x, y) ≤ threshold; 0, otherwise (formula 5)

In this similarity gain for two words, dis(x, y) is the spatial distance between word x and word y in the word vector data set, and threshold is the maximum spatial distance at which two words are still judged to be synonyms.
After the similarity between the new corpus and every topic corpus set is calculated, the new corpus is assigned to the topic with the highest similarity. The similarity between the corpus and each concept corpus set under that topic is then calculated, and the corpus is placed into the corresponding concept corpus set. In addition, a matching threshold can be set so that when similarity matching is low, the assignment is checked and adjusted manually. This threshold must be set according to the specific corpus and word vector data set: usually several unmatched word pairs are taken and their average similarity is calculated; depending on the actual situation, a margin such as 0.2 is added to this average, and the result is used as the judgment threshold.
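The similarity-based assignment of formulas 4 and 5 can be sketched as follows. The binary synonym score, 1 when the word-vector distance is within threshold and 0 otherwise, is one plausible reading of formula 5; the distance function is supplied by the caller, since the patent does not fix a particular word vector data set.

```python
def similarity(t_words, s_words, dis, threshold):
    # Formula 4: average score over the m × n feature-word pairs.
    # Formula 5 (as read here): a pair scores 1 when dis(x, y) <= threshold.
    m, n = len(t_words), len(s_words)
    total = sum(1 for x in t_words for y in s_words if dis(x, y) <= threshold)
    return total / (m * n)

def assign(t_words, corpora, dis, threshold):
    # Pick the topic (or concept) corpus set with the highest similarity.
    return max(corpora, key=lambda k: similarity(t_words, corpora[k], dis, threshold))

# Toy distance: identical words are distance 0, all others distance 1.
toy_dis = lambda x, y: 0.0 if x == y else 1.0
print(similarity(["a", "b"], ["a", "c"], toy_dis, 0.5))  # one matching pair out of four
print(assign(["a"], {"T1": ["a"], "T2": ["z"]}, toy_dis, 0.5))
```

Assignments whose best similarity stays below the matching threshold would then go to manual review, per the discussion above.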
(5) Feature set expansion
After each round of coding new data, the classification items with a low matching degree can be checked and adjusted manually. If a new concept or new topic arises outside the existing coding framework, the framework is adjusted. The feature set of the new corpus is then re-extracted in the manner of step (3) and retained in the corpus. This encoding process is repeated until all data are coded.
The invention addresses the problem that the coding stage of grounded theory requires a great deal of effort for corpus coding. Based on the classic encoding process and procedures of grounded theory research, it designs an encoding process and method that integrate machine intelligence; this machine-intelligence-integrated classification coding method can effectively improve coding efficiency in qualitative research using grounded theory.
In the traditional qualitative research coding method, researchers usually go through the following stages of coding analysis:
(1) transcribing the recorded material into text;
(2) extracting relevant sentence content and labeling the sentences with the corresponding concepts and topics;
(3) forming a preliminary coding framework through discussion, once sufficient sentence content is available;
(4) assigning the sentences in subsequent transcripts to the coding framework, and adjusting the framework dynamically;
(5) carrying out the related research analysis based on the coding framework and the related corpora.
With the coding optimization method of the invention, machine intelligence can assist at several stages and save tedious manual work, mainly in the following respects:
(1) the recorded material can be transcribed directly with a speech-to-text tool;
(2) in the stage of extracting relevant sentence content, the original corpus can be segmented with a sentence segmentation tool, after which researchers select the relevant sentences;
(3) once a preliminary coding framework has been formed, the remaining corpora can be automatically coded and classified; researchers only need to attend to the corpora with a low matching degree and make limited adjustments.
By adopting the qualitative research method of the invention, machine intelligence is used at several stages, which markedly assists and optimizes the coding process and improves researchers' coding efficiency.
Drawings
FIG. 1 is a core encoding flow chart of the method of the present invention.
Detailed Description
The pseudocode implementing the machine-intelligence-integrated coding classification is given in Appendix 1.
The invention is a machine-assisted encoding and classification flow. The coding feature extraction and automatic coding links are introduced by example below, with the TF-IDF method as the classification standard.
(1) In the data preprocessing link, a speech-to-text tool can be used to transcribe the audio material into text; the text is then processed with sentence segmentation to obtain the corpus.
(2) In the manual pre-coding link, the coding framework can be determined by several researchers coding independently and then negotiating. For example, 3 of the recording transcripts are selected as the pre-coding corpus.
(3) In the feature extraction link, features are extracted from the coded data: the features of each topic and each concept are extracted for subsequent automatic coding.
(4) In the automatic coding link, taking TF-IDF as the example method, the similarity of two corpora is calculated and each corpus is assigned to the topic and concept with the highest similarity. A corpus with low similarity can be recoded, and if its coding result is not contained in the existing coding framework, the framework is adjusted. In this example, the remaining 27 corpora are handled by automatic coding.
(5) With the final encoding scheme, researchers can carry out further research based on the corresponding coding results.
At every link of qualitative research coding, the coding optimization method of the invention makes full use of machine intelligence to assist the coding process.
The above description is only exemplary of the present invention and should not be taken as limiting the invention, as any modification, equivalent replacement, or improvement made within the spirit and scope of the present invention is also included in the present invention.
Appendix 1
(Pseudocode of the precoding algorithm, published as images in the original document; its flow is described in step (2) of the Description.)

Claims (4)

1. A machine-intelligence-assisted grounded theory coding optimization method, characterized by comprising the following specific steps:
(1) data pre-processing
After the interview recordings are obtained, they are transcribed with transcription software or a platform, and the corresponding text material is obtained through manual review;
then the interview transcript is cut into sentence blocks by a sentence segmentation tool, and the segmentation result is adjusted as needed through manual review to obtain the corpus that serves as the raw material for coding;
(2) artificial precoding
The corpus obtained in step (1) is manually pre-coded to form a preliminary coding scheme; in the pre-coding algorithm, the selected raw material is coded at the concept level and the topic level through cyclic coding with random data selection, and the coding framework is adjusted continuously until preliminary information saturation is reached or the current data set is fully coded; in addition, new data can be coded on top of an existing coding result, which gives the method high flexibility: when information saturation has not been reached, or when the user considers the encoding incomplete, new data can continue to be coded;
(3) coding feature extraction
On the basis of the pre-coding scheme, coding features are extracted to enable automatic classification coding of subsequent data; the coding features are extracted by the TF-IDF method; TF denotes Term Frequency, calculated by dividing the number of occurrences word_cnt of the word in the material by the total word count total_cnt of the material, as shown in formula 1:
TF = word_cnt / total_cnt (formula 1)
IDF denotes Inverse Document Frequency, which measures the general importance of a word; the total number of documents total_file is divided by the number of documents file_cnt containing the word, and the base-10 logarithm of the quotient gives the IDF value, as shown in formula 2:
IDF = log10(total_file / file_cnt) (formula 2)
Finally, TF is multiplied by IDF to obtain the TF-IDF value, as shown in formula 3:
TF-IDF = TF × IDF (formula 3)
(4) Automatic coding
On the basis of the feature extraction in step (3), new corpora are coded and classified, supplementing the corpora in the coding framework; the features extracted in step (3) are used, and the new corpus is automatically encoded and classified by the TF-IDF method;
for a Chinese corpus, the text material is first segmented into words; after common words are removed, the remaining words are taken as the feature words of the corpus; the degree to which the text matches each concept and topic classification is then calculated from these words, and the text is assigned to the topic and coding classification with the highest matching degree;
specifically, the similarity between a new corpus t and a corpus s is calculated as shown in formula 4:
sim(t, s) = ( Σ_{i=1..m} Σ_{j=1..n} score(t_i, s_j) ) / (m × n) (formula 4)
here, m and n are the numbers of feature words of the new corpus t and the corpus s respectively; score(t_i, s_j) is the similarity score between the i-th word of corpus t and the j-th word of corpus s, calculated as shown in formula 5:
score(x, y) = 1, if dis(x, y) ≤ threshold; 0, otherwise (formula 5)
where dis(x, y) is the spatial distance between word x and word y in the word vector data set, and threshold is the maximum spatial distance at which two words are still judged to be synonyms;
after calculating the similarity of all subject corpus sets of the new corpus, distributing the new corpus to a subject with the highest similarity; then, calculating the similarity between the corpus and all concept corpus sets under the theme, and dividing the theme into corresponding concept corpus sets;
(5) feature set expansion
After each round of coding new data, the classification items with a low matching degree are checked and adjusted manually; if a new concept or new topic arises outside the existing coding framework, the framework is adjusted; the feature set of the new corpus is then re-extracted in the manner of step (3) and retained in the corpus; this encoding process is repeated until all data are coded.
2. The machine-intelligence-assisted grounded theory coding optimization method according to claim 1, wherein the precoding algorithm in step (2) is as follows:
each encoding run continues on the result set of the previous run or starts from an empty coding result; in each pass, an uncoded data item PD is randomly selected from the new data set; a corresponding concept CN is produced for the item by manual coding; then the topic set TS of the current encoding result CT is searched topic by topic for a matching concept; if the concept already exists, the item is added to the corresponding topic and concept set;
the cyclic coding continues until information saturation is reached or the current new data set is fully coded; information saturation is judged by comparing the ISV_cnt value with the ISV value: the user-defined ISV value means that after ISV further data items have been coded, the total number of concepts in the coding result is still unchanged, at which point information saturation is considered reached; ISV_cnt records how many consecutive coding steps have left the total concept count unchanged;
if new data still needs to be coded, the algorithm is run again with the previous coding result and the new data to supplement the encoding.
3. The machine-intelligence-assisted grounded theory coding optimization method according to claim 1, wherein in step (3), for a text application scenario, the feature extraction is divided into two levels: topic-level corpus feature extraction and concept-level corpus feature extraction;
in topic-level extraction, all corpora under a topic are taken as one material and the corpora under all topics form the corpus set, and the num_topic words with the highest TF-IDF values are computed as the features of that material; in concept-level extraction, all corpora under a concept are taken as one material and all corpora under the topic to which the concept belongs form the corpus set, and the num_concept words with the highest TF-IDF values are extracted as the features of that material; the coding framework is thus divided into two levels: the first level consists of topics, of which there are several, and the second level consists of concepts, with several concepts under each topic.
4. The machine-intelligence-assisted grounded theory coding optimization method according to claim 1, wherein in step (4), a corresponding threshold is set, and when similarity matching is low, the assignment is checked and adjusted manually.
CN202010178957.0A 2020-03-15 2020-03-15 Machine-intelligence-assisted grounded theory coding optimization method Active CN111488725B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010178957.0A CN111488725B (en) 2020-03-15 2020-03-15 Machine-intelligence-assisted grounded theory coding optimization method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010178957.0A CN111488725B (en) 2020-03-15 2020-03-15 Machine-intelligence-assisted grounded theory coding optimization method

Publications (2)

Publication Number Publication Date
CN111488725A true CN111488725A (en) 2020-08-04
CN111488725B CN111488725B (en) 2023-04-07

Family

ID=71794403

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010178957.0A Active CN111488725B (en) Machine-intelligence-assisted grounded theory coding optimization method

Country Status (1)

Country Link
CN (1) CN111488725B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020049705A1 (en) * 2000-04-19 2002-04-25 E-Base Ltd. Method for creating content oriented databases and content files
CN107153664A (en) * 2016-03-04 2017-09-12 同方知网(北京)技术有限公司 A kind of method flow that research conclusion is simplified based on the scientific and technical literature mark that assemblage characteristic is weighted
CN110825877A (en) * 2019-11-12 2020-02-21 中国石油大学(华东) Semantic similarity analysis method based on text clustering

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020049705A1 (en) * 2000-04-19 2002-04-25 E-Base Ltd. Method for creating content oriented databases and content files
CN107153664A (en) * 2016-03-04 2017-09-12 同方知网(北京)技术有限公司 A kind of method flow that research conclusion is simplified based on the scientific and technical literature mark that assemblage characteristic is weighted
CN110825877A (en) * 2019-11-12 2020-02-21 中国石油大学(华东) Semantic similarity analysis method based on text clustering

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Tian Dafang et al.: "Research on the similarity measurement of journal papers based on keywords", Modern Information (《现代情报》) *
Xie Yanming et al.: "Exploring a thematic extraction analysis method for qualitative data based on grounded theory", Liaoning Journal of Traditional Chinese Medicine (《辽宁中医杂志》) *

Also Published As

Publication number Publication date
CN111488725B (en) 2023-04-07

Similar Documents

Publication Publication Date Title
CN110807328B (en) Named entity identification method and system for legal document multi-strategy fusion
CN110750635B (en) French recommendation method based on joint deep learning model
CN109670041A (en) A kind of band based on binary channels text convolutional neural networks is made an uproar illegal short text recognition methods
CN107679031B (en) Advertisement and blog identification method based on stacking noise reduction self-coding machine
CN111177386B (en) Proposal classification method and system
CN110807324A (en) Video entity identification method based on IDCNN-crf and knowledge graph
CN111897930A (en) Automatic question answering method and system, intelligent device and storage medium
CN109446423B (en) System and method for judging sentiment of news and texts
CN112307130B (en) Document-level remote supervision relation extraction method and system
CN115186654B (en) Method for generating document abstract
CN107818173B (en) Vector space model-based Chinese false comment filtering method
CN111460158A (en) Microblog topic public emotion prediction method based on emotion analysis
CN111858842A (en) Judicial case screening method based on LDA topic model
CN113065349A (en) Named entity recognition method based on conditional random field
CN114942990A (en) Few-sample abstract dialogue abstract generation system based on prompt learning
CN110175332A (en) A kind of intelligence based on artificial neural network is set a question method and system
CN112200674B (en) Stock market emotion index intelligent calculation information system
CN111488725B (en) Machine-intelligence-assisted grounded theory coding optimization method
CN115310429B (en) Data compression and high-performance calculation method in multi-round listening dialogue model
CN111460147A (en) Title short text classification method based on semantic enhancement
CN115795026A (en) Chinese text abstract generation method based on comparative learning
CN115840815A (en) Automatic abstract generation method based on pointer key information
CN114969511A (en) Content recommendation method, device and medium based on fragments
CN114880635A (en) User security level identification method, system, electronic device and medium of model integrated with lifting tree construction
CN114282498A (en) Data knowledge processing system applied to electric power transaction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant