CN113420554A - Ancient poetry word frequency analysis method and system - Google Patents

Ancient poetry word frequency analysis method and system

Info

Publication number
CN113420554A
Authority
CN
China
Prior art keywords
list
word frequency
poems
data set
mapping table
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110675786.7A
Other languages
Chinese (zh)
Other versions
CN113420554B (en)
Inventor
韩珍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zaozhuang Vocational College of Science and Technology
Original Assignee
Zaozhuang Vocational College of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zaozhuang Vocational College of Science and Technology
Priority to CN202110675786.7A
Publication of CN113420554A
Application granted
Publication of CN113420554B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Probability & Statistics with Applications (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure relates to an ancient poetry word frequency analysis method, comprising: acquiring a first data set comprising ancient poems, and constructing a first document from the first data set, wherein the first data set comprises at least M poems; performing word frequency analysis on the first document to obtain a first list representing a word frequency ranking, and establishing, according to the first list, a first mapping table from the keywords in the first list to the names of the M poems in the first data set; removing the function words from the first list according to function word information preset in a function word library to generate a second list, and updating the first mapping table according to the second list to form a second mapping table; selecting from the second list, according to conditions preset by a user, at least one keyword that meets the preset conditions and ranks highest in word frequency, and determining the names of N poems from the correspondence between the keyword and the second mapping table; and displaying the content of each poem according to the names of the N poems.

Description

Ancient poetry word frequency analysis method and system
Technical Field
The invention relates to an information processing method, in particular to an ancient poetry word frequency analysis method and system.
Background
Poetry here generally refers to classical-style metrical shi poems and ci lyrics, such as the widely known Tang shi and Song ci. Generally speaking, shi and ci are regarded as suited to different expressive purposes. Recorded Chinese shi poetry originated in the pre-Qin period and flourished in the Tang Dynasty. Chinese ci originated in the Sui and Tang dynasties and became popular in the Song Dynasty. Chinese poetry originated among the common people and is a grassroots literature. Carried forward with the culture, poetry is still deeply loved by the general public in China in the 21st century. Moreover, its appeal is not limited to lovers of traditional literature: for ordinary people, especially teenagers and children, the edifying influence of traditional poetry culture is very beneficial to strengthening national confidence and national pride. For this reason, the courses offered by many early childhood education institutions include poetry. Even in electronic products such as early-education story machines, poetry makes up a considerable share of the content. However, the poems included in the early-education story machines currently on the market differ from product to product, and there is no unified standard. According to statistics, for Tang poetry alone, the Quan Tangshi (Complete Tang Poems) records 55763 surviving works. Similarly, the Quan Songci (Complete Song Ci) alone records 20000 ci. These figures cover only works recorded at the time or later, which are mostly the finest poems; many less popular poems may not have been recorded, yet it cannot be ruled out that some of them are worth appreciating.
In addition, considering the psychological development of children and teenagers and the forms and contents of poems, it is readily apparent that not all poems are suitable as learning and appreciation material for young people. Therefore, an effective and reasonable analysis and classification of traditional Chinese ancient poems is urgently needed to guide the early traditional-literature education of children and teenagers.
Disclosure of Invention
In view of the foregoing problems in the prior art, one aspect of the present invention provides an ancient poetry word frequency analysis method which, through word frequency analysis, can push preset ancient poems to a user at a preset time according to conditions set by the user.
To achieve this purpose, the ancient poetry word frequency analysis method provided by the invention comprises the following steps:
acquiring a first data set comprising ancient poems, and constructing a first document according to the first data set, wherein the first data set at least comprises M poems;
performing word frequency analysis on the first document to obtain a first list representing a word frequency ranking, and establishing, according to the first list, a first mapping table from the keywords in the first list to the names of the M poems in the first data set;
removing the function words from the first list according to function word information preset in a function word library to generate a second list, and updating the first mapping table according to the second list to form a second mapping table; the second mapping table at least comprises classification information corresponding to the poems;
selecting from the second list, according to conditions preset by a user, at least one keyword that meets the preset conditions and ranks highest in word frequency, and determining the names of N poems from the correspondence between the keyword and the second mapping table;
displaying the content of each poem according to the names of the N poems; M is larger than N, and M and N are both natural numbers.
In the technical solution of the invention, the classification information corresponding to the poems follows a conventional classification, is pre-stored on the device or in the cloud, and comprises a farewell-to-friends class, a frontier-and-war class, a travel-and-homesickness class, an object-ode class, a reminiscence-and-history class, a scenery-and-lyric class and a landscape-and-pastoral class.
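The function word removal and screening steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patented implementation: the function names, the tiny sample function word set, the "unclassified" fallback, and the idea that the user's preset condition is simply a wanted category are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical function word library; a real one would be far larger.
FUNCTION_WORDS: Set[str] = {"之", "乎", "者", "也", "而"}


def remove_function_words(
    first_list: List[Tuple[str, int]],
    first_map: Dict[str, List[str]],
    categories: Dict[str, str],
) -> Tuple[List[Tuple[str, int]], Dict[str, List[Tuple[str, str]]]]:
    """Build the second list and the second mapping table.

    The second mapping table maps keyword -> [(poem name, category)].
    """
    second_list = [(w, n) for w, n in first_list if w not in FUNCTION_WORDS]
    second_map = {
        w: [(t, categories.get(t, "unclassified")) for t in first_map.get(w, [])]
        for w, _ in second_list
    }
    return second_list, second_map


def select_poems(
    second_list: List[Tuple[str, int]],
    second_map: Dict[str, List[Tuple[str, str]]],
    wanted_category: str,
) -> List[str]:
    """Return the poem names for the highest-ranked keyword that has
    at least one poem in the wanted category."""
    for word, _count in second_list:  # assumed sorted by descending frequency
        names = [t for t, c in second_map[word] if c == wanted_category]
        if names:
            return names
    return []
```

In this sketch the second list keeps the ordering of the first list, so the first keyword that yields any poem in the requested category is also the one with the highest word frequency rank.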
Preferably, the obtaining of the first data set including the ancient poems comprises obtaining pre-stored poem information from a local database, and/or obtaining pre-stored poem information from a cloud server, and/or obtaining the poem information through a WebAPI interface.
Preferably, constructing a first document from the first data set comprises:
for each poem, collecting its name, author name, era and content, and joining them with a first fixed separator to form block information; the block information further comprises block sequence information;
and sequentially joining the pieces of block information corresponding to the poems with a second fixed separator, and storing the result in text form to generate the first document.
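A minimal Python sketch of this document construction is given below. The separator characters, the `Poem` type, and the choice to store the block sequence information as a leading sequence number are assumptions made for illustration; the source does not specify them. The sketch also assumes the separators never occur inside a poem's own text.

```python
from dataclasses import dataclass
from typing import List

FIELD_SEP = "|"   # assumed "first fixed separator"
BLOCK_SEP = "\n"  # assumed "second fixed separator"


@dataclass
class Poem:
    name: str
    author: str
    era: str
    content: str


def build_first_document(poems: List[Poem]) -> str:
    """Join each poem's fields into a block, then join the blocks."""
    blocks = []
    for seq, p in enumerate(poems):
        # The leading sequence number stands in for the block sequence information.
        blocks.append(FIELD_SEP.join([str(seq), p.name, p.author, p.era, p.content]))
    return BLOCK_SEP.join(blocks)
```

Keeping the sequence number inside each block means a later step can recover which poem a keyword came from without re-reading the original data set.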
Preferably, the word frequency analysis is performed on the first document to obtain a first list representing word frequency ordering, and the method includes:
performing word segmentation processing on the first document to obtain a keyword set;
removing stop words from the keyword set, wherein the stop words at least comprise author names and eras;
and counting the word frequency in the keyword set to obtain a first list representing word frequency ordering.
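The three steps above can be sketched in Python as follows. The segmenter here is a deliberate simplification that treats every CJK character as one token; the source does not name a segmentation algorithm, and a real system would likely use a proper Chinese word segmenter. The function names and the sample stop word are likewise hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple


def segment(text: str) -> List[str]:
    """Stand-in segmenter: one token per CJK character, punctuation dropped."""
    return [ch for ch in text if "\u4e00" <= ch <= "\u9fff"]


def rank_word_frequency(
    texts: Iterable[str], stop_words: Set[str]
) -> List[Tuple[str, int]]:
    """Return (keyword, count) pairs sorted by descending count."""
    counts: Counter = Counter()
    for text in texts:
        counts.update(tok for tok in segment(text) if tok not in stop_words)
    return counts.most_common()


ranked = rank_word_frequency(["床前明月光，疑是地上霜", "举头望明月"], {"是"})
```

In practice the stop word set would be filled with the author names and eras taken from the block information, so that metadata does not dominate the ranking.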
Preferably, establishing a first mapping table from the keywords in the first list to the names of M poems in the first data set includes:
establishing an index in the first document according to the keywords in the keyword set;
acquiring the block sequence information of the keywords according to the index;
and obtaining, according to the block sequence information, a first mapping table between the keywords and the names of the poems.
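One way to read these steps is as an inverted index from keywords to block sequence numbers, which are then resolved to poem names. The sketch below assumes a hypothetical block layout of `seq|name|author|era|content` with one block per line; neither the layout nor the function name comes from the source.

```python
from typing import Dict, Iterable, List

FIELD_SEP = "|"   # assumed first fixed separator
BLOCK_SEP = "\n"  # assumed second fixed separator


def build_first_mapping(document: str, keywords: Iterable[str]) -> Dict[str, List[str]]:
    """Map each keyword to the names of the poems whose content contains it."""
    unique = list(dict.fromkeys(keywords))  # de-duplicate, keep order
    # Index: keyword -> block sequence numbers in which the keyword occurs.
    index: Dict[str, List[int]] = {k: [] for k in unique}
    names: Dict[int, str] = {}
    for block in document.split(BLOCK_SEP):
        seq, name, _author, _era, content = block.split(FIELD_SEP, 4)
        names[int(seq)] = name
        for k in unique:
            if k in content:
                index[k].append(int(seq))
    # Resolve block sequence numbers to poem names.
    return {k: [names[s] for s in seqs] for k, seqs in index.items()}
```

Matching only against the content field keeps a keyword from being attributed to a poem merely because it appears in the author name or era.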
Preferably, the counting the word frequency in the keyword set includes:
performing a cluster analysis on the set of keywords,
a first list characterizing word frequency ordering is generated based on the cluster analysis results.
Preferably, counting the word frequency in the keyword set further comprises removing keywords whose word frequency is smaller than a first preset value.
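As a short illustration, the threshold step might look like the following in Python; the function name and the parameter `min_count` (standing for the "first preset value") are hypothetical.

```python
from typing import List, Tuple


def drop_rare_keywords(
    ranked: List[Tuple[str, int]], min_count: int
) -> List[Tuple[str, int]]:
    """Keep only keywords whose frequency is at least min_count."""
    return [(word, n) for word, n in ranked if n >= min_count]
```

The filter preserves the order of its input, so a list that was already ranked by frequency stays ranked.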
The ancient poetry word frequency analysis system provided by the invention comprises:
the data acquisition unit is configured to acquire a first data set comprising ancient poems and construct a first document according to the first data set, wherein the first data set at least comprises M poems;
the word frequency analysis unit is configured to perform word frequency analysis on the first document to obtain a first list representing a word frequency ranking, and to establish, according to the first list, a first mapping table from the keywords in the first list to the names of the M poems in the first data set;
the information screening module is configured to remove the function words from the first list according to function word information preset in a function word library to generate a second list, and to update the first mapping table according to the second list to form a second mapping table, the second mapping table at least comprising classification information corresponding to the poems; and to select from the second list, according to conditions preset by a user, at least one keyword that meets the preset conditions and ranks highest in word frequency, and determine the names of N poems from the correspondence between the keyword and the second mapping table;
the display unit is configured to display the content of each poem according to the names of the N poems; M is larger than N, and M and N are both natural numbers.
Preferably, the system further comprises a WebAPI interface, a cloud server and/or a storage unit, wherein the WebAPI interface is configured to obtain the first data set from a public API, and the cloud server and/or the storage unit is configured to store poem information containing at least the first data set.
Preferably, the word frequency analyzing unit includes:
a word segmentation module configured to perform word segmentation processing on the first document to obtain a keyword set;
a stop word removing module configured to remove stop words from the keyword set, the stop words including at least author names and eras;
and the word frequency counting module is configured to count the word frequency in the keyword set to obtain a first list representing word frequency ordering.
Compared with the prior art, the ancient poetry word frequency analysis method and system provided by the invention can be applied to electronic products such as early-education story machines. A large number of ancient poems can be preset in the product's built-in storage. When using the product, the user can preset themes (for example, particular categories of poems) according to the characteristics of the child's age group; a certain number of ancient poems can then be read, randomly or in order, at each run or within a specific time period, and word frequency statistics are computed on them, so that poems matching the age group and the preset requirements are selected for learning and appreciation.
Drawings
FIG. 1 is a flow chart of the ancient poetry word frequency analysis method of the present invention.
Fig. 2 is a word segmentation flow chart of the ancient poetry word frequency analysis method of the invention.
Fig. 3 is a schematic structural diagram of block information of the ancient poetry word frequency analysis method of the present invention.
Fig. 4 is a system block diagram of the ancient poetry word frequency analysis system of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
Various aspects and features of the present invention are described herein with reference to the drawings.
These and other characteristics of the invention will become apparent from the following description of a preferred form of embodiment, given as a non-limiting example, with reference to the accompanying drawings.
It should also be understood that, although the invention has been described with reference to some specific examples, a person of skill in the art shall certainly be able to achieve many other equivalent forms of the invention, having the characteristics as set forth in the claims and hence all coming within the field of protection defined thereby.
The above and other aspects, features and advantages of the present invention will become more apparent in view of the following detailed description when taken in conjunction with the accompanying drawings.
Specific embodiments of the present invention are described hereinafter with reference to the accompanying drawings; however, it is to be understood that the disclosed embodiments are merely exemplary of the invention, which can be embodied in various forms. Well-known and/or repeated functions and constructions are not described in detail to avoid obscuring the invention in unnecessary or redundant detail. Therefore, specific structural and functional details disclosed herein are not to be interpreted as limiting, but merely as a basis for the claims and as a representative basis for teaching one skilled in the art to variously employ the present invention in virtually any appropriately detailed structure.
The specification may use the phrases "in one embodiment," "in another embodiment," "in yet another embodiment," or "in other embodiments," which may each refer to one or more of the same or different embodiments in accordance with the invention.
As shown in fig. 1, an embodiment of the present invention provides a method for analyzing term frequency of ancient poetry, including:
S1, obtaining a first data set comprising ancient poems, and constructing a first document from the first data set, wherein the first data set comprises at least M poems;
S2, performing word frequency analysis on the first document to obtain a first list representing a word frequency ranking, and establishing, according to the first list, a first mapping table from the keywords in the first list to the names of the M poems in the first data set;
S3, removing the function words from the first list according to function word information preset in a function word library to generate a second list, and updating the first mapping table according to the second list to form a second mapping table; the second mapping table at least comprises classification information corresponding to the poems;
S4, selecting from the second list, according to conditions preset by a user, at least one keyword that meets the preset conditions and ranks highest in word frequency, and determining the names of N poems from the correspondence between the keyword and the second mapping table;
S5, displaying the content of each poem according to the names of the N poems; M is larger than N, and M and N are both natural numbers.
In practice, for example when the invention is applied to an early-education story machine, M is generally set much larger than N: a parent who wants to explain poems to a child or appreciate them together usually needs to select only 2-3 poems at a time. In current early-education machines, the poetry appreciation function usually plays poems stored in the device in a fixed sequence or a fixed combination. In the invention, by contrast, 100 ancient poems, for example, can be randomly selected each time the early-education machine runs and combined into a first document; word frequency analysis is then performed on the first document in order to identify the content of the randomly obtained poems, and the one or more poems associated with the highest word frequency can then be pushed to the user according to the conditions the user has set.
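The random selection of M poems at each run could be sketched as below. The function name, the default of 100 (taken from the example in the text), and the optional `seed` parameter used to make a run reproducible are assumptions for illustration only.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def sample_first_data_set(
    library: Sequence[T], m: int = 100, seed: Optional[int] = None
) -> List[T]:
    """Randomly draw up to m distinct poems from the stored library."""
    rng = random.Random(seed)
    return rng.sample(list(library), min(m, len(library)))
```

Capping the sample size at the library size avoids an error when the device stores fewer than M poems.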
In the technical solution of the invention, the classification information corresponding to the poems follows a conventional classification, is pre-stored on the device or in the cloud, and comprises a farewell-to-friends class, a frontier-and-war class, a travel-and-homesickness class, an object-ode class, a reminiscence-and-history class, a scenery-and-lyric class and a landscape-and-pastoral class. The classification information can be stored in a storage unit of the early-education story machine, or in a cloud server from which it can be retrieved at any time.
In step S1 of the present invention, obtaining a first data set comprising ancient poems includes obtaining pre-stored poem information from a local database or a storage unit, and/or obtaining pre-stored poem information from a cloud server, and/or obtaining the poem information through a WebAPI interface. In the ancient poetry word frequency analysis system shown in Fig. 4, the WebAPI used is "Today's Poetry" (https://www.jinrishici.com/).
Meanwhile, after the first data set of ancient poems is obtained in step S1, the first document may be constructed from the first data set as shown in Fig. 3, namely: for each poem, collecting its name, author name, era and content, and joining them with a first fixed separator 10 to form block information, the block information further comprising block sequence information; and sequentially joining the pieces of block information corresponding to the poems with a second fixed separator 40, and storing the result in text form to generate the first document. The example of Fig. 3 shows only block information 20 and block information 30; in practice, the first document may link M ancient poems in the same manner.
Furthermore, in step S2, performing word frequency analysis on the first document to obtain a first list representing a word frequency ranking may specifically include: performing word segmentation processing on the first document to obtain a keyword set; removing stop words from the keyword set, wherein the stop words at least comprise author names and eras; and counting the word frequency in the keyword set to obtain the first list representing the word frequency ranking.
Still further, establishing a first mapping table from the keywords in the first list to the names of the M poems in the first data set includes: establishing an index in the first document according to the keywords in the keyword set; acquiring the block sequence information of the keywords according to the index; and obtaining, according to the block sequence information, a first mapping table between the keywords and the names of the poems.
Preferably, counting the word frequency in the keyword set includes: performing cluster analysis on the keyword set, and generating the first list representing the word frequency ranking based on the cluster analysis result. The cluster analysis in this step may specifically use K-means clustering, mean-shift clustering, density-based clustering (DBSCAN), expectation-maximization (EM) clustering with a Gaussian mixture model (GMM), hierarchical clustering, or graph community detection.
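The source does not say which features are clustered or how the clusters produce the ranked list. As one possible reading only, the sketch below runs a plain one-dimensional K-means over the keyword counts and groups the keywords into frequency tiers, highest tier first. The function names, the evenly spaced initial centres, and the choice of counts as the clustering feature are all assumptions.

```python
from typing import Dict, List


def kmeans_1d(values: List[float], k: int, iters: int = 20) -> List[int]:
    """Plain 1-D K-means; returns a cluster label for each value."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    # Deterministic initial centres spread evenly across the value range.
    centres = [lo + (hi - lo) * i / max(k - 1, 1) for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: nearest centre for every value.
        labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        # Update step: move each centre to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:  # an empty cluster keeps its old centre
                centres[c] = sum(members) / len(members)
    return labels


def frequency_tiers(freqs: Dict[str, int], k: int = 2) -> List[List[str]]:
    """Group keywords into up to k frequency tiers, highest tier first."""
    words = sorted(freqs, key=lambda w: -freqs[w])
    labels = kmeans_1d([float(freqs[w]) for w in words], k)
    tiers: Dict[int, List[str]] = {}
    for word, lab in zip(words, labels):
        tiers.setdefault(lab, []).append(word)
    # Order tiers by the mean frequency of their members, descending.
    return sorted(tiers.values(), key=lambda ws: -sum(freqs[w] for w in ws) / len(ws))
```

Flattening the returned tiers in order gives a list that is still ranked by frequency, with the tier boundaries available as an extra signal for screening.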
Further, counting the word frequency in the keyword set also comprises removing keywords whose word frequency is smaller than a first preset value.
The ancient poetry word frequency analysis system provided by the invention comprises:
the data acquisition unit is configured to acquire a first data set comprising ancient poems and construct a first document according to the first data set, wherein the first data set at least comprises M poems;
the word frequency analysis unit is configured to perform word frequency analysis on the first document to obtain a first list representing a word frequency ranking, and to establish, according to the first list, a first mapping table from the keywords in the first list to the names of the M poems in the first data set;
the information screening module is configured to remove the function words from the first list according to function word information preset in a function word library to generate a second list, and to update the first mapping table according to the second list to form a second mapping table, the second mapping table at least comprising classification information corresponding to the poems; and to select from the second list, according to conditions preset by a user, at least one keyword that meets the preset conditions and ranks highest in word frequency, and determine the names of N poems from the correspondence between the keyword and the second mapping table;
the display unit is configured to display the content of each poem according to the names of the N poems; M is larger than N, and M and N are both natural numbers.
Preferably, the system further comprises a WebAPI interface, a cloud server and/or a storage unit, wherein the WebAPI interface is configured to obtain the first data set from a public API, and the cloud server and/or the storage unit is configured to store poem information containing at least the first data set.
Preferably, the word frequency analyzing unit includes:
a word segmentation module configured to perform word segmentation processing on the first document to obtain a keyword set;
a stop word removing module configured to remove stop words from the keyword set, the stop words including at least author names and eras;
and the word frequency counting module is configured to count the word frequency in the keyword set to obtain a first list representing word frequency ordering.
Various specific embodiments of the methods described above, including various software modules, may be implemented on the computer-readable storage media.
Various operations or functions are described above, which may be implemented as or defined as software code or instructions. Such content may be directly executable ("object" or "executable" form), source code, or difference code ("delta" or "patch" code). Software implementations of embodiments described herein may be provided via an article of manufacture having code or instructions stored therein or via a method of operating a communication interface to transmit data via the communication interface. A machine or computer-readable storage medium may cause a machine to perform the functions or operations described, and includes any mechanism for storing information in a form accessible by a machine (e.g., a computing device, an electronic system, etc.), such as recordable/non-recordable media (e.g., Read Only Memory (ROM), Random Access Memory (RAM), magnetic disk storage media, optical storage media, flash memory devices, etc.). A communication interface includes any mechanism that interfaces to any of a hardwired, wireless, optical, etc. medium to communicate with another device, such as a memory bus interface, a processor bus interface, an internet connection, a disk controller, etc. The communication interface may be configured by providing configuration parameters and/or transmitting signals to prepare the communication interface to provide data signals describing the software content. The communication interface may be accessed via one or more commands or signals sent to the communication interface.
The present invention also relates to a system for performing the operations herein. The system may be specially constructed for the required purposes, or it may comprise a general-purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a computer-readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, and magneto-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, or any type of media suitable for storing electronic instructions, each coupled to a computer system bus.
The above embodiments are only exemplary embodiments of the present invention, and are not intended to limit the present invention, and the scope of the present invention is defined by the claims. Various modifications and equivalents may be made by those skilled in the art within the spirit and scope of the present invention, and such modifications and equivalents should also be considered as falling within the scope of the present invention.

Claims (10)

1. An ancient poetry word frequency analysis method, comprising:
acquiring a first data set comprising ancient poems, and constructing a first document from the first data set, wherein the first data set comprises at least M poems;
performing word frequency analysis on the first document to obtain a first list representing a word frequency ranking, and establishing, according to the first list, a first mapping table from the keywords in the first list to the names of the M poems in the first data set;
removing the function words from the first list according to function word information preset in a function word library to generate a second list, and updating the first mapping table according to the second list to form a second mapping table; the second mapping table at least comprises classification information corresponding to the poems;
selecting from the second list, according to conditions preset by a user, at least one keyword that meets the preset conditions and ranks highest in word frequency, and determining the names of N poems from the correspondence between the keyword and the second mapping table;
displaying the content of each poem according to the names of the N poems; M is larger than N, and M and N are both natural numbers.
2. The method of claim 1, wherein obtaining the first data set including ancient poems comprises obtaining pre-stored poem information from a local database, and/or obtaining pre-stored poem information from a cloud server, and/or obtaining the poem information through a WebAPI interface.
3. The method of claim 1, wherein constructing a first document from the first data set comprises:
for each poem, collecting its name, author name, era and content, and joining them with a first fixed separator to form block information; the block information further comprises block sequence information;
and sequentially joining the pieces of block information corresponding to the poems with a second fixed separator, and storing the result in text form to generate the first document.
4. The method of claim 1, wherein performing word frequency analysis on the first document to obtain a first list representing a word frequency ranking comprises:
performing word segmentation processing on the first document to obtain a keyword set;
removing stop words from the keyword set, wherein the stop words at least comprise author names and eras;
and counting the word frequency in the keyword set to obtain a first list representing word frequency ordering.
5. The method of claim 1, wherein establishing a first mapping table from the keywords in the first list to the names of the M poems in the first data set comprises:
establishing an index in the first document according to the keywords in the keyword set;
acquiring the block sequence information of the keywords according to the index;
and obtaining, according to the block sequence information, a first mapping table between the keywords and the names of the poems.
6. The method of claim 4, wherein counting word frequencies in the keyword set comprises:
performing a cluster analysis on the set of keywords,
a first list characterizing word frequency ordering is generated based on the cluster analysis results.
7. The method of claim 4, wherein the word frequency of the keyword set is counted, and further comprising removing the keywords with the word frequency less than a first predetermined value.
8. An ancient poetry word frequency analysis system, comprising:
a data acquisition unit configured to acquire a first data set comprising ancient poems and to construct a first document according to the first data set, wherein the first data set comprises at least M poems;
a word frequency analysis unit configured to perform word frequency analysis on the first document to obtain a first list representing a word frequency ordering, and to establish, according to the first list, a first mapping table from keywords in the first list to the names of the M poems in the first data set;
an information screening module configured to: remove function words from the first list according to function word information preset in a function word library, so as to generate a second list; update the first mapping table according to the second list to form a second mapping table, the second mapping table comprising at least classification information corresponding to the poems; screen, according to conditions preset by a user, at least one keyword that satisfies the preset conditions and ranks highest in word frequency in the second list; and determine the names of N poems according to the correspondence between the keyword and the second mapping table; and
a display unit configured to display the content of each poem according to the names of the N poems, wherein M is greater than N, and M and N are both natural numbers.
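The behaviour of the information screening module in claim 8 might be sketched as below. Several points are assumptions rather than anything the claim states: the user's preset condition is modelled as a single wanted category, the classification information is a title-to-category dictionary, and the word library is a plain set named `function_words`. All names and the toy data are hypothetical.

```python
def screen_poems(first_list, first_mapping, function_words, categories, wanted_category):
    """Return the top-ranked keyword with poems in the wanted category, plus their titles."""
    # Second list: the ranked list with the library's words removed.
    second_list = [(w, c) for w, c in first_list if w not in function_words]
    # Second mapping table: keyword -> [(title, category)], built only for
    # keywords that survived the filtering step.
    second_mapping = {
        w: [(t, categories.get(t, "unknown")) for t in first_mapping.get(w, [])]
        for w, _ in second_list
    }
    # Walk the list from the highest frequency down and stop at the first
    # keyword that has at least one poem satisfying the condition.
    for w, _ in second_list:
        titles = [t for t, cat in second_mapping[w] if cat == wanted_category]
        if titles:
            return w, titles
    return None, []


# Toy data: counts and categories are invented for illustration.
first_list = [("不", 5), ("明月", 3), ("风雨", 2)]
first_mapping = {"不": ["春晓"], "明月": ["静夜思", "望月怀远"], "风雨": ["春晓"]}
categories = {"静夜思": "homesickness", "望月怀远": "longing", "春晓": "spring"}

keyword, titles = screen_poems(first_list, first_mapping, {"不"}, categories, "homesickness")
print(keyword, titles)
```

The returned titles correspond to the N poems handed to the display unit; N is smaller than M because only poems matching both the keyword and the condition are kept.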
9. The system of claim 8, further comprising a WebAPI interface configured to obtain the first data set from a public API, a cloud server and/or a storage unit configured to store verse information including at least the first data set.
10. The system of claim 8, wherein the word frequency analysis unit comprises:
a word segmentation module configured to perform word segmentation on the first document to obtain a keyword set;
a stop word removal module configured to remove stop words from the keyword set, the stop words including at least author names and years; and
a word frequency counting module configured to count the word frequencies in the keyword set to obtain the first list representing the word frequency ordering.
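The three modules of claim 10 can be chained as in the following sketch. The claims do not name a segmentation algorithm, so forward maximum matching against a small lexicon is swapped in here purely as a stand-in; the lexicon, the stop-word set, the punctuation set and the function names are all hypothetical.

```python
from collections import Counter

PUNCTUATION = set("，。！？、；：")


def segment(text, lexicon, max_len=4):
    """Forward maximum matching: take the longest lexicon word at each position."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted as a fallback token.
            if size == 1 or piece in lexicon:
                words.append(piece)
                i += size
                break
    return words


def word_frequency_list(document, lexicon, stop_words):
    """Segment, drop stop words, whitespace and punctuation, then rank by count."""
    tokens = segment(document, lexicon)
    keywords = [
        t for t in tokens
        if t not in stop_words and t not in PUNCTUATION and not t.isspace()
    ]
    return Counter(keywords).most_common()


# Hypothetical first document: an author name followed by the poem text.
document = "李白 床前明月光，疑是地上霜。举头望明月，低头思故乡。"
lexicon = {"李白", "明月", "故乡", "地上"}
stop_words = {"李白"}  # author names are stop words per the claim

result = word_frequency_list(document, lexicon, stop_words)
print(result[:3])
```

In practice a dedicated Chinese segmentation library would replace `segment`; the stop-word and counting stages would stay the same.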
CN202110675786.7A 2021-06-18 2021-06-18 Ancient poetry frequency analysis method and system Active CN113420554B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110675786.7A CN113420554B (en) 2021-06-18 2021-06-18 Ancient poetry frequency analysis method and system

Publications (2)

Publication Number Publication Date
CN113420554A (en) 2021-09-21
CN113420554B (en) 2023-10-27

Family

ID=77789031

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110675786.7A Active CN113420554B (en) 2021-06-18 2021-06-18 Ancient poetry frequency analysis method and system

Country Status (1)

Country Link
CN (1) CN113420554B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040267722A1 (en) * 2003-06-30 2004-12-30 Larimore Stefan Isbein Fast ranked full-text searching
WO2007143914A1 (en) * 2006-06-02 2007-12-21 Beijing Sogou Technology Development Co., Ltd. Method, device and inputting system for creating word frequency database based on web information
CN108845987A (en) * 2018-06-01 2018-11-20 北京玄科技有限公司 A kind of poem search method and system based on semantic analysis
CN109635295A (en) * 2018-06-01 2019-04-16 安徽省泰岳祥升软件有限公司 A kind of poem search method and system based on semantic analysis
CN109033162A (en) * 2018-06-19 2018-12-18 深圳市元征科技股份有限公司 A kind of data processing method, server and computer-readable medium
CN110399385A (en) * 2019-06-24 2019-11-01 厦门市美亚柏科信息股份有限公司 A kind of semantic analysis and system for small data set
CN112307302A (en) * 2020-09-29 2021-02-02 青岛檬豆网络科技有限公司 New technology query recommendation method based on keyword extraction

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LIU ZHE: "网络文学用词词频统计系统的研究与实现" (Research and Implementation of a Word Frequency Statistics System for Online Literature), 《HTTPS://WWW.DOCIN.COM/P-1861118138.HTML》, 6 March 2017 (2017-03-06), pages 1 *
邱怡轩: "统计词话" (Statistical Remarks on Ci Poetry), 《HTTPS://COSX.ORG/2011/03/STATISTICS-IN-CHINESE-SONG-POEM-1》, 4 March 2011 (2011-03-04) *

Also Published As

Publication number Publication date
CN113420554B (en) 2023-10-27

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant