CN110825792A - High-concurrency distributed data retrieval method based on golang middleware coroutine mode

Publication number
CN110825792A
Authority
CN
China
Prior art keywords
data
golang
middleware
configuration
page
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911117727.7A
Other languages
Chinese (zh)
Other versions
CN110825792B (en)
Inventor
苏学武
杨刚
赖冠
龚波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZHUHAI XINDEHUI INFORMATION TECHNOLOGY Co Ltd
Original Assignee
ZHUHAI XINDEHUI INFORMATION TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-11-15
Filing date
2019-11-15
Publication date
2020-02-21
Application filed by ZHUHAI XINDEHUI INFORMATION TECHNOLOGY Co Ltd
Priority to CN201911117727.7A
Publication of CN110825792A
Application granted
Publication of CN110825792B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471 Distributed queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Fuzzy Systems (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a high-concurrency distributed data retrieval method based on a golang middleware coroutine mode, comprising the following specific steps: uploading the data information to be acquired by following the provided user interaction page and its operation guide; preprocessing the uploaded data information and automatically adding the Elasticsearch index, type and type structure; adding a timed acquisition task for the data acquisition configuration on a timer configuration page; having the system automatically schedule the data acquisition task, which is executed through the golang middleware, and storing the data in an Elasticsearch library; performing word segmentation and semantic analysis on the acquired data using the word segmentation and semantic analysis technologies of Elasticsearch; and opening the data set to terminal users through an interface configuration page. The method introduces golang's high-concurrency coroutine (goroutine) technology, which speeds up data acquisition and data arrangement to a certain extent and improves acquisition efficiency; it also automatically removes duplicate data, which improves the data utilization rate.

Description

High-concurrency distributed data retrieval method based on golang middleware coroutine mode
Technical Field
The invention relates to the technical field of database retrieval, in particular to a high-concurrency distributed data retrieval method based on a golang middleware coroutine mode, which is used for constructing resource synchronization of a public security database.
Background
With the rapid development of the economy and of science and technology in recent years, informatization in the public security industry has also developed rapidly, but it has been accompanied by problems such as low data quality, poor processing capability, insufficient standards and specifications, insufficient shared applications and shallow professional applications. Using technology to meet the challenges posed by massive, heterogeneous data resources and by diversified, complex application requirements is the key to informatization. At present, however, full-text search products are each built by their own vendor, and each vendor adopts a different technical implementation. Because there is no unified technical approach, data extraction and external interface schemes are inefficient, interfaces are not universal, and later maintenance is not timely. In view of these problems, the applicant compared and analyzed the mainstream full-text search products on the market and found that most of them, and the technologies they use, have the following problems:
1. Retrieval functions: 1) the word hit rate is low and the category retrieval function is limited; 2) word-segmentation retrieval is lacking; 3) information is acquired far more slowly than network resources grow.
2. Data cleaning and data governance: 1) data extraction is disorderly; 2) the data source is single, and the data storage method is complicated, slow and not universal; 3) a unified technical approach is lacking, and the external interface schemes are inefficient and their interfaces are not universal.
3. Other aspects: 1) compatibility is insufficient, and the products only suit Internet-facing product forms; 2) the products demand strong technical skills, are cumbersome to operate, and cannot adapt well to diverse application scenarios; 3) later maintenance and data updates are not timely, data-flow logs are lacking, and tuning places high demands on hardware.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a high-concurrency distributed data retrieval method based on a golang middleware coroutine mode. A set of simple, easy-to-use web configuration pages is designed and developed to solve the problems of a single extraction data source, complicated interactive interface design, a complicated and slow data storage method and difficult data storage, so that data acquisition efficiency and data application efficiency are effectively improved, later maintenance is ensured, and conditions are prepared for strengthening law enforcement standards and improving law enforcement efficiency.
In order to solve the technical problems, the technical scheme adopted by the invention is as follows.
The high-concurrency distributed data retrieval method based on the golang middleware coroutine mode comprises the following specific steps of:
A. uploading the data information to be acquired by following the provided user interaction page and its operation guide, and then uploading the configuration information for data acquisition;
B. preprocessing the data information uploaded in step A to form a corresponding data structure rule, and automatically adding the Elasticsearch index, type and type structure;
C. after the data acquisition environment of step B has been arranged, adding a timed acquisition task for the acquisition configuration on the timer configuration page according to the designed acquisition configuration;
D. according to the timed task configured in step C, having the system automatically schedule the data acquisition task, which is executed through the golang middleware, and storing the data in the Elasticsearch library;
E. when the data enters the Elasticsearch library, performing word segmentation and semantic analysis on the acquired data using the word segmentation and semantic analysis technologies of Elasticsearch to obtain the final data set to be stored;
F. opening the data set to terminal users through the interface configuration page.
In a further refinement of the technical scheme, the data information comprises text data and text data configuration information.
In a further refinement of the technical scheme, the data information comprises database connection configuration information and table information.
In a further refinement of the technical scheme, in step B the feature rule serves as the basis for page rendering, data arrangement and data storage.
In a further refinement of the technical scheme, in step B the Elasticsearch index, type and type structure are added automatically: a text directory, or a database and its tables, is added and, in combination with a system background automation program, mapped according to a configured set of data structures.
In a further refinement of the technical scheme, in step C the data acquisition environment is arranged by combining automation with manually entered configuration.
In a further refinement of the technical scheme, step D comprises the following specific steps:
D1. landing the data to be stored into a local file on the server through golang code, storing the mapping relation between the input source and the output source in the program, and saving the related logs;
D2. comparing the landed file with the data in the index mapped in Elasticsearch, filtering out illegal data, and screening out the data that needs to be stored and holding it in memory;
D3. importing the filtered data into the index mapped in Elasticsearch through multiple highly concurrent coroutines.
In a further refinement of the technical scheme, in step D2 the data comparison mainly uses the kNN algorithm to classify and screen the data.
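The source does not say how records are turned into feature vectors or how kNN is wired into deduplication, so the following is only a minimal sketch of plain k-nearest-neighbour classification. The `Sample` type, the two-dimensional features and the labels `duplicate` and `new` are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Sample is a labelled feature vector. The feature encoding is hypothetical;
// the patent does not specify one.
type Sample struct {
	Features []float64
	Label    string
}

// distance is the Euclidean distance between two vectors of equal length.
func distance(a, b []float64) float64 {
	sum := 0.0
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return math.Sqrt(sum)
}

// classify returns the majority label among the k training samples nearest to x.
// On a tie, the label that reaches the winning count first (nearest first) wins.
func classify(train []Sample, x []float64, k int) string {
	sorted := make([]Sample, len(train))
	copy(sorted, train)
	sort.Slice(sorted, func(i, j int) bool {
		return distance(sorted[i].Features, x) < distance(sorted[j].Features, x)
	})
	if k > len(sorted) {
		k = len(sorted)
	}
	votes := map[string]int{}
	best, bestCount := "", 0
	for _, s := range sorted[:k] {
		votes[s.Label]++
		if votes[s.Label] > bestCount {
			best, bestCount = s.Label, votes[s.Label]
		}
	}
	return best
}

func main() {
	train := []Sample{
		{[]float64{0, 0}, "duplicate"},
		{[]float64{0.1, 0.1}, "duplicate"},
		{[]float64{0, 0.2}, "duplicate"},
		{[]float64{5, 5}, "new"},
		{[]float64{5.2, 4.9}, "new"},
		{[]float64{4.8, 5.1}, "new"},
	}
	fmt.Println(classify(train, []float64{0.05, 0.1}, 3)) // prints "duplicate"
}
```

In a real pipeline the features would presumably be derived from the configured keywords, and the label would decide whether a record is dropped or kept for import.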
In a further refinement of the technical scheme, in step E the word segmentation and semantic analysis technology mainly uses the jieba word segmenter, which implements word segmentation with the following algorithm:
E1. efficient word-graph scanning is implemented based on a prefix dictionary, generating a directed acyclic graph (DAG) of all possible word formations of the Chinese characters in a sentence;
E2. dynamic programming is used to search for the maximum-probability path and find the best segmentation combination based on word frequency;
E3. for unknown words, an HMM model based on the word-forming capability of Chinese characters is used with the Viterbi algorithm, and pinyin-to-character conversion and word segmentation are carried out on the basis of a large amount of real data.
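Steps E1 and E2 can be illustrated with a small sketch. It is not jieba's implementation: the frequency table is a toy, the dictionary lookup scans every substring instead of pruning with a prefix dictionary, and the function name `segment` is hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// segment splits text into words by building a DAG of dictionary matches and
// choosing the maximum-probability path with dynamic programming.
// freq holds toy word frequencies; a real dictionary is far larger.
func segment(text string, freq map[string]float64) []string {
	runes := []rune(text)
	n := len(runes)
	total := 0.0
	for _, f := range freq {
		total += f
	}
	if total == 0 {
		total = 1 // avoid dividing by zero when the dictionary is empty
	}

	// dag[i] lists every end index j such that runes[i:j+1] is a candidate word.
	dag := make([][]int, n)
	for i := 0; i < n; i++ {
		dag[i] = []int{i} // a single character is always a fallback candidate
		for j := i + 1; j < n; j++ {
			if _, ok := freq[string(runes[i:j+1])]; ok {
				dag[i] = append(dag[i], j)
			}
		}
	}

	// score[i] is the best log-probability of segmenting runes[i:];
	// next[i] is the end index of the first word on that best path.
	score := make([]float64, n+1)
	next := make([]int, n)
	for i := n - 1; i >= 0; i-- {
		score[i] = math.Inf(-1)
		for _, j := range dag[i] {
			f := freq[string(runes[i:j+1])]
			if f == 0 {
				f = 1 // smoothing for characters missing from the dictionary
			}
			s := math.Log(f/total) + score[j+1]
			if s > score[i] {
				score[i], next[i] = s, j
			}
		}
	}

	// Follow the recorded choices from left to right to read off the words.
	var words []string
	for i := 0; i < n; {
		j := next[i]
		words = append(words, string(runes[i:j+1]))
		i = j + 1
	}
	return words
}

func main() {
	freq := map[string]float64{"我": 50, "爱": 30, "北京": 100, "北": 5, "京": 5}
	fmt.Println(segment("我爱北京", freq)) // prints [我 爱 北京]
}
```

Working from the end of the sentence backwards means each position is scored once, which is what makes the search a dynamic program rather than an enumeration of all paths through the DAG.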
Due to the adoption of the technical scheme, the technical progress of the invention is as follows.
The invention imports various kinds of data into the database in a visual manner and forms corresponding rules; the defined rules serve as the basis for page rendering, data arrangement and data storage. Because the acquisition method is configuration-driven, business personnel can scale the system horizontally through the provided configuration functions without involving developers, which meets the acquisition requirements of various data sources while reducing developers' workload to a certain extent and lowering code coupling. The method effectively solves the problems of a single extraction data source, complicated interactive interface design, a complicated and slow data storage method and difficult data storage, effectively improves data acquisition efficiency and data application efficiency, ensures timely later maintenance, and prepares conditions for strengthening law enforcement standards and improving law enforcement efficiency.
The invention combines configuration-based design with relational database applications to acquire various kinds of data from heterogeneous data sources and to ensure the robustness of the acquisition method.
The method introduces golang's high-concurrency coroutine (goroutine) technology, which speeds up data acquisition and data arrangement to a certain extent and improves acquisition efficiency; it also automatically removes duplicate data, which improves the data utilization rate.
The invention adopts the word segmentation technology and the semantic analysis technology to carry out deep analysis on the extracted text information so as to extract element information with higher data value, provide data support for realizing more subsequent upper-layer applications, fully exert data efficiency and provide assistance for automatic data arrangement and manual data arrangement of the acquisition method.
Drawings
FIG. 1 is a general flow diagram of the present invention.
Detailed Description
The invention will be described in further detail below with reference to the figures and specific examples.
In this high-concurrency distributed data retrieval method based on a golang middleware coroutine mode, functions are developed around the characteristics of golang, so that the high-concurrency and multi-coroutine advantages of the golang language can be brought to bear on data processing and on importing into ES (Elasticsearch), and full-text retrieval can be achieved. High concurrency is one of the factors that must be considered in the architecture design of an Internet distributed system; it generally means that the system is designed to handle many requests at the same time. A coroutine needs only about 2 KB of memory to run, so thousands of concurrent tasks can run simultaneously with a small memory footprint. The method can process more than 40,000 records per second, and more than 100,000 records per second with large-scale cluster processing.
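As a rough illustration of the coroutine model described above, the sketch below fans records out to a fixed pool of goroutines. The function name `processAll`, the string record type and the transform callback are assumptions for the example, not part of the patented middleware.

```go
package main

import (
	"fmt"
	"sync"
)

// processAll fans records out to a fixed number of worker goroutines and
// returns the processed results in input order.
func processAll(records []string, workers int, transform func(string) string) []string {
	if workers < 1 {
		workers = 1
	}
	results := make([]string, len(records))
	jobs := make(chan int)
	var wg sync.WaitGroup

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				// Each index is written by exactly one goroutine, so no lock is needed.
				results[i] = transform(records[i])
			}
		}()
	}
	for i := range records {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	out := processAll([]string{"a", "b", "c"}, 2, func(s string) string { return s + "!" })
	fmt.Println(out) // prints [a! b! c!]
}
```

Bounding the number of workers keeps memory and downstream load predictable even though individual goroutines are cheap.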
With reference to FIG. 1, the high-concurrency distributed data retrieval method based on the golang middleware coroutine mode comprises the following specific steps:
A. Upload the data information to be acquired by following the provided user interaction page and its operation guide, and then upload the configuration information for data acquisition. The data information comprises text data and text data configuration information, or it comprises database connection configuration information and table information.
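One possible shape for the uploaded database configuration is sketched below. The field names (`driver`, `dsn`, `tables`), the JSON encoding and the sample connection string are all hypothetical; the source only states that connection information and table information are uploaded.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SourceConfig is a hypothetical acquisition configuration for a database source.
type SourceConfig struct {
	Driver string   `json:"driver"` // e.g. "oracle", "mysql" or "postgresql"
	DSN    string   `json:"dsn"`    // connection string for the source database
	Tables []string `json:"tables"` // tables to acquire
}

// parseConfig decodes the uploaded JSON and rejects obviously incomplete input.
func parseConfig(raw string) (SourceConfig, error) {
	var cfg SourceConfig
	if err := json.Unmarshal([]byte(raw), &cfg); err != nil {
		return cfg, err
	}
	if cfg.Driver == "" || len(cfg.Tables) == 0 {
		return cfg, fmt.Errorf("config needs a driver and at least one table")
	}
	return cfg, nil
}

func main() {
	raw := `{"driver":"mysql","dsn":"user:pass@tcp(db.example:3306)/cases","tables":["person","vehicle"]}`
	cfg, err := parseConfig(raw)
	if err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println(cfg.Driver, cfg.Tables)
}
```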
B. Preprocess the data information uploaded in step A to form a corresponding data structure rule, and automatically add the Elasticsearch index, type and type structure. A data structure is the way a computer stores and organizes data; it refers to a collection of data elements that have one or more specific relationships to each other.
The Elasticsearch index, type and type structure are added automatically: a text directory, or a database and its tables, is added and, in combination with the system's background automation program, mapped according to a configured set of data structures. The database types supported in this step are Oracle, MySQL and PostgreSQL.
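The idea of deriving an Elasticsearch mapping from a configured table structure might look like the sketch below. The `Column` type and the SQL-to-Elasticsearch type table are assumptions; the sketch only builds the JSON `properties` body and does not contact a cluster.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Column describes one source-table column. The names and the type table
// below are hypothetical, not taken from the patent.
type Column struct {
	Name    string
	SQLType string
}

// esType maps a few common SQL column types to Elasticsearch field types.
func esType(sqlType string) string {
	switch sqlType {
	case "int", "bigint":
		return "long"
	case "float", "double", "decimal":
		return "double"
	case "date", "datetime", "timestamp":
		return "date"
	case "varchar", "char":
		return "keyword"
	default:
		return "text"
	}
}

// buildMapping returns the JSON body of a "properties" mapping for the columns.
func buildMapping(cols []Column) (string, error) {
	props := map[string]map[string]string{}
	for _, c := range cols {
		props[c.Name] = map[string]string{"type": esType(c.SQLType)}
	}
	body, err := json.Marshal(map[string]interface{}{"properties": props})
	return string(body), err
}

func main() {
	body, err := buildMapping([]Column{{"id", "bigint"}, {"title", "text"}})
	if err != nil {
		fmt.Println("mapping error:", err)
		return
	}
	fmt.Println(body) // prints {"properties":{"id":{"type":"long"},"title":{"type":"text"}}}
}
```

Because `encoding/json` writes map keys in sorted order, the generated body is stable across runs, which makes it easy to compare against an existing mapping.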
C. The data acquisition environment of step B is arranged by combining automation with manually entered configuration; then, according to the designed acquisition configuration, a timed acquisition task is added for the acquisition configuration on the timer configuration page.
Automation means that the system automatically builds the data structure and automatically synchronizes the data. Manually entered configuration refers to manually configuring the data source and data scheduling.
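A timed acquisition task could be driven by a ticker, as in the minimal sketch below. The names `runEvery` and `demo`, the 5 ms interval and the stop-after-three-runs rule are assumptions chosen so the example finishes quickly; the source does not describe the scheduler's internals.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// runEvery calls task once per interval until ctx is cancelled and returns
// how many times the task ran.
func runEvery(ctx context.Context, interval time.Duration, task func()) int {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	runs := 0
	for {
		select {
		case <-ctx.Done():
			return runs
		case <-ticker.C:
			// If cancellation and a tick arrive together, cancellation wins.
			if ctx.Err() != nil {
				return runs
			}
			task()
			runs++
		}
	}
}

// demo schedules a stand-in acquisition task and stops it after three runs.
func demo() int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	collected := 0
	return runEvery(ctx, 5*time.Millisecond, func() {
		collected++
		if collected == 3 {
			cancel()
		}
	})
}

func main() {
	fmt.Println("runs:", demo()) // prints runs: 3
}
```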
D. According to the timed task configured in step C, the system automatically schedules the data acquisition task, which is executed through the golang middleware, and stores and merges the data into the Elasticsearch library.
Step D comprises the following specific steps:
D1. Land the data to be stored into a local file on the server through golang code, store the mapping relation between the input source and the output source in the program, and save the related logs.
D2. Compare the landed file with the data in the index mapped in Elasticsearch, filter out illegal data, and screen out the data that needs to be stored and hold it in memory.
Illegal data refers to data whose format is abnormal or whose value exceeds the configured normal range.
Data comparison means that a golang middleware coroutine queries the database data set and compares it with the Elasticsearch data set using the configured keywords. The comparison is mainly implemented with the kNN algorithm, so that duplicate data is automatically deduplicated and sorted.
kNN (k-nearest neighbors) is a basic classification and regression method. It relies on the observation that samples of the same class cluster together in feature space, so the algorithm can be used to classify and screen the data.
D3. Import the filtered data into the index mapped in Elasticsearch through multiple highly concurrent coroutines.
E. When the data enters the Elasticsearch library, word segmentation and semantic analysis are performed on the acquired data using the word segmentation and semantic analysis technologies of Elasticsearch, extracting more valuable information from the acquired data and improving the search hit rate and search speed. This yields the final data set to be stored.
The word segmentation and semantic analysis technology mainly uses the jieba word segmenter, which implements word segmentation with the following algorithm:
E1. Efficient word-graph scanning is implemented based on a prefix dictionary, generating a directed acyclic graph (DAG) of all possible word formations of the Chinese characters in a sentence.
E2. Dynamic programming is used to search for the maximum-probability path and find the best segmentation combination based on word frequency.
E3. For unknown words, an HMM model is used together with the Viterbi algorithm to convert pinyin into Chinese characters and to segment the characters into words; the algorithm computes the optimal result from a large amount of real data. Its principle is as follows:
The probability distribution of each state $S_t$ in the random process depends only on its previous state $S_{t-1}$, i.e. $P(S_t \mid S_1, S_2, S_3, \ldots, S_{t-1}) = P(S_t \mid S_{t-1})$.
The steps of the Viterbi algorithm can be summarized as follows:
If the most probable path P (or shortest path) passes through a certain point, such as $X_{22}$, then the sub-path Q of P from the starting point S to $X_{22}$ must be the shortest path between S and $X_{22}$. Otherwise, replacing Q with the shortest path R from S to $X_{22}$ would give a path shorter than P, which is a contradiction. This shows that the principle of optimality is satisfied.
The path from S to E must pass through some state at the i-th time step. Assuming there are k states at the i-th time step, if the shortest paths from S to all k nodes of the i-th step are recorded, the final shortest path must pass through one of them, so at any time step only a very limited number of shortest paths need to be considered.
Combining the two points above: when moving from step i to step i+1, the shortest paths from S to each node of step i have already been found and recorded on those nodes. Therefore, when computing the shortest path from the starting point S to a node $X_{i+1,j}$ of step i+1, it is only necessary to consider the shortest paths from S to all k nodes of step i and the distances from those nodes to $X_{i+1,j}$.
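The recurrence described above can be written out as a short sketch. It uses plain probabilities and a made-up two-state model for readability; jieba's own HMM works on character observations with its own trained parameters, and production code would normally use log-probabilities to avoid underflow.

```go
package main

import "fmt"

// viterbi returns the most probable state sequence for obs.
// start[s], trans[p][s] and emit[s][o] are plain probabilities, and states
// and observations are small integer indices.
func viterbi(obs []int, start []float64, trans, emit [][]float64) []int {
	if len(obs) == 0 {
		return nil
	}
	nStates := len(start)
	T := len(obs)
	prob := make([][]float64, T) // prob[t][s]: best path probability ending in s at time t
	back := make([][]int, T)     // back[t][s]: predecessor of s on that best path
	for t := range prob {
		prob[t] = make([]float64, nStates)
		back[t] = make([]int, nStates)
	}
	for s := 0; s < nStates; s++ {
		prob[0][s] = start[s] * emit[s][obs[0]]
	}
	for t := 1; t < T; t++ {
		for s := 0; s < nStates; s++ {
			// Only the best path into each previous state has to be extended.
			for p := 0; p < nStates; p++ {
				v := prob[t-1][p] * trans[p][s] * emit[s][obs[t]]
				if v > prob[t][s] {
					prob[t][s], back[t][s] = v, p
				}
			}
		}
	}
	best := 0
	for s := 1; s < nStates; s++ {
		if prob[T-1][s] > prob[T-1][best] {
			best = s
		}
	}
	path := make([]int, T)
	path[T-1] = best
	for t := T - 1; t > 0; t-- {
		path[t-1] = back[t][path[t]]
	}
	return path
}

func main() {
	// Toy model with two hidden states (0 and 1) and three observation symbols.
	start := []float64{0.6, 0.4}
	trans := [][]float64{{0.7, 0.3}, {0.4, 0.6}}
	emit := [][]float64{{0.5, 0.4, 0.1}, {0.1, 0.3, 0.6}}
	fmt.Println(viterbi([]int{0, 1, 2}, start, trans, emit)) // prints [0 0 1]
}
```

At each time step the table keeps one best path per state, which is the "very limited number of shortest paths" the text refers to.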
F. The data set is opened to terminal users through the interface configuration page.
The method introduces golang's high-concurrency coroutine (goroutine) technology, which speeds up data acquisition and data arrangement to a certain extent and improves acquisition efficiency; it also automatically removes duplicate data, which improves the data utilization rate.
The invention adopts the word segmentation technology and the semantic analysis technology to carry out deep analysis on the extracted text information so as to extract element information with higher data value, provide data support for realizing more subsequent upper-layer applications, fully exert data efficiency and provide assistance for automatic data arrangement and manual data arrangement of the acquisition method.

Claims (9)

1. A high-concurrency distributed data retrieval method based on a golang middleware coroutine mode, characterized by comprising the following specific steps:
A. uploading the data information to be acquired by following the provided user interaction page and its operation guide, and then uploading the configuration information for data acquisition;
B. preprocessing the data information uploaded in step A to form a corresponding data structure rule, and automatically adding the Elasticsearch index, type and type structure;
C. after the data acquisition environment of step B has been arranged, adding a timed acquisition task for the acquisition configuration on the timer configuration page according to the designed acquisition configuration;
D. according to the timed task configured in step C, having the system automatically schedule the data acquisition task, which is executed through the golang middleware, and storing the data in the Elasticsearch library;
E. when the data enters the Elasticsearch library, performing word segmentation and semantic analysis on the acquired data using the word segmentation and semantic analysis technologies of Elasticsearch to obtain the final data set to be stored;
F. opening the data set to terminal users through the interface configuration page.
2. The high-concurrency distributed data retrieval method based on the golang middleware coroutine mode as claimed in claim 1, wherein the data information comprises text data and text data configuration information.
3. The high-concurrency distributed data retrieval method based on the golang middleware coroutine mode as claimed in claim 1, wherein the data information comprises database connection configuration information and table information.
4. The high-concurrency distributed data retrieval method based on the golang middleware coroutine mode as claimed in claim 1, wherein in step B the feature rule serves as the basis for page rendering, data arrangement and data storage.
5. The high-concurrency distributed data retrieval method based on the golang middleware coroutine mode as claimed in claim 1, wherein in step B the Elasticsearch index, type and type structure are added automatically by adding a text directory, or a database and its tables, and mapping it, in combination with a system background automation program, according to a configured set of data structures.
6. The high-concurrency distributed data retrieval method based on the golang middleware coroutine mode as claimed in claim 1, wherein in step C the data acquisition environment is arranged by combining automation with manually entered configuration.
7. The high-concurrency distributed data retrieval method based on the golang middleware coroutine mode as claimed in claim 1, wherein step D comprises the following specific steps:
D1. landing the data to be stored into a local file on the server through golang code, storing the mapping relation between the input source and the output source in the program, and saving the related logs;
D2. comparing the landed file with the data in the index mapped in Elasticsearch, filtering out illegal data, and screening out the data that needs to be stored and holding it in memory;
D3. importing the filtered data into the index mapped in Elasticsearch through multiple highly concurrent coroutines.
8. The high-concurrency distributed data retrieval method based on the golang middleware coroutine mode as claimed in claim 7, wherein in step D2 the data comparison mainly uses the kNN algorithm to classify and screen the data.
9. The high-concurrency distributed data retrieval method based on the golang middleware coroutine mode as claimed in claim 1, wherein in step E the word segmentation and semantic analysis technology mainly uses the jieba word segmenter, which implements word segmentation with the following algorithm:
E1. efficient word-graph scanning is implemented based on a prefix dictionary, generating a directed acyclic graph of all possible word formations of the Chinese characters in a sentence;
E2. dynamic programming is used to search for the maximum-probability path and find the best segmentation combination based on word frequency;
E3. for unknown words, an HMM model based on the word-forming capability of Chinese characters is used with the Viterbi algorithm, and pinyin-to-character conversion and word segmentation are carried out on the basis of a large amount of real data.
CN201911117727.7A 2019-11-15 2019-11-15 High concurrency distributed data retrieval method based on golang middleware cooperative mode Active CN110825792B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911117727.7A CN110825792B (en) 2019-11-15 2019-11-15 High concurrency distributed data retrieval method based on golang middleware cooperative mode

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911117727.7A CN110825792B (en) 2019-11-15 2019-11-15 High concurrency distributed data retrieval method based on golang middleware cooperative mode

Publications (2)

Publication Number Publication Date
CN110825792A (en) 2020-02-21
CN110825792B (en) 2024-06-07

Family

ID=69555567

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911117727.7A Active CN110825792B (en) 2019-11-15 2019-11-15 High concurrency distributed data retrieval method based on golang middleware cooperative mode

Country Status (1)

Country Link
CN (1) CN110825792B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111782396A (en) * 2020-07-01 2020-10-16 浪潮云信息技术股份公司 Concurrency flexible control method based on distributed database
CN111814142A (en) * 2020-06-29 2020-10-23 上海三零卫士信息安全有限公司 Big data rapid threat detection system based on OpenIOC

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103678684A (en) * 2013-12-25 2014-03-26 沈阳美行科技有限公司 Chinese word segmentation method based on navigation information retrieval
CN109358956A (en) * 2018-09-30 2019-02-19 上海保险交易所股份有限公司 Service calling method
CN109582551A (en) * 2018-10-11 2019-04-05 平安科技(深圳)有限公司 Daily record data analytic method, device, computer equipment and storage medium
CN109739727A (en) * 2019-01-03 2019-05-10 优信拍(北京)信息科技有限公司 Service monitoring method and device in micro services framework

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
WEYLAU: "Go 用 500 行 Golang 代码实现高性能的消息回调中间件", pages 1 - 12 *
余昌发等: "基于Kubernetes的分布式TensorFlow平台的设计与实现", 计算机科学, no. 2, 15 November 2018 (2018-11-15) *
罗东锋;李芳;郝汪洋;吴仲城;: "基于Docker的大规模日志采集与分析系统", no. 10, pages 82 - 88 *
罗东锋等: "基于Docker 的大规模日志采集与分析系统", pages 82 - 88 *

Also Published As

Publication number Publication date
CN110825792B (en) 2024-06-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant