CN116244486A - Crawling data processing method and system based on data stream - Google Patents

Crawling data processing method and system based on data stream

Info

Publication number
CN116244486A
Authority
CN
China
Prior art keywords
data
item
crawling
items
pipeline
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310244348.4A
Other languages
Chinese (zh)
Inventor
程宇浩
王丹琛
万振华
王颉
李华
董燕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Seczone Technology Co Ltd
Original Assignee
Seczone Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Seczone Technology Co Ltd
Priority to CN202310244348.4A
Publication of CN116244486A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a crawling data processing method and system based on data flow. The method comprises: crawling target data based on key information bars to generate a plurality of data Items, and transmitting the data Items to a first data pipeline; receiving the data Items through the first data pipeline, inputting each data Item into the data cleaning function corresponding to its category for cleaning, and transmitting the cleaned data Items that meet the requirements to a second data pipeline; and creating a plurality of data warehousing functions of different types, receiving the data Items through the second data pipeline, inputting each data Item into the data warehousing function corresponding to its category for warehousing query processing, and updating the database according to the query result. This processing scheme has a clear logical structure and is easy to extend, allows a data acquisition project to be set up quickly, and is less error-prone when qualifying data are stored in the database.

Description

Crawling data processing method and system based on data stream
Technical Field
The present invention relates to the field of crawling data processing technologies, and in particular, to a crawling data processing method and system based on data flow.
Background
With the development of artificial intelligence technology, more and more functions require large amounts of data as support, and a significant portion of enterprise users employ crawler tools to collect data and then analyze it with big-data techniques. Crawler technology captures data from web pages, device information, and other sources according to defined rules and methods. However, the data collected by a crawler tool is usually far from usable as-is, so it must go through extensive cleaning and warehousing procedures. An enterprise or a project often needs to collect data across tens or hundreds of dimensions. In the prior art, a multitask data collection program therefore generally adopts a parallel processing mode: the processing modules (such as cleaning and warehousing) are independent of one another, each module is communicatively connected to the database, and each module places its output in the database after completing its own task. For example, the cleaning module takes data out of the database, cleans it, and puts the data that meets the requirements back into the database; the warehousing module then extracts the data placed there by the cleaning module for further processing. With this traditional approach, when there are many data types, building the framework for data acquisition and processing is a large amount of work, management is difficult, the logic easily becomes confused, and data is easily warehoused repeatedly.
Disclosure of Invention
The invention aims to provide a crawling data processing method and system based on data flow, which can quickly build a data acquisition and processing program framework, has a clear logic structure and is not easy to make mistakes.
In order to achieve the above object, the present invention discloses a crawling data processing method based on data flow, which includes:
creating a plurality of key information bars, each belonging to a different dimension, for use in crawling data;
crawling target data based on the key information bars by means of a crawler tool to generate a plurality of data Items, wherein each data Item contains one item of target data, and transmitting the data Items to a first data pipeline;
creating a plurality of different types of data cleaning functions;
receiving the data Item through the first data pipeline, inputting the data Item into a corresponding data cleaning function according to the category of the data Item for cleaning, and transmitting the cleaned data Item meeting the requirement to a second data pipeline;
creating a plurality of data warehouse-in functions of different types;
and receiving the data Item through the second data pipeline, inputting the data Item into a corresponding data warehousing function according to the category of the data Item for warehousing query processing, and updating a database according to the query processing result.
Preferably, each key information bar includes a plurality of data fields, field names of data fields representing the same content in the key information bars of different dimensions are the same, and table names and table unique indexes corresponding to each key information bar are integrated in the same information table to perform unified management.
Preferably, when a new data Item is added to the database, all data Items in the database that belong to the same category as the newly added data Item are sorted as a whole.
Preferably, the method for sorting the data Items as a whole comprises:
when a data Item is processed by the data warehousing function, transmitting the data Items that meet the warehousing condition to a third data pipeline;
receiving the data Item from the third data pipeline with a data marking function, marking the data Item, and writing the feature name of the data Item into Redis;
and reading the corresponding feature names from Redis with a data sorting function, and sorting the marks of the data Items of the same kind in the database based on the feature names read.
The invention also discloses a crawling data processing system based on the data stream, which comprises:
the data preparation module is used for creating a plurality of key information bars, each belonging to a different dimension, for use in crawling data;
the data acquisition module is used for crawling target data based on the key information bars by means of a crawler tool to generate a plurality of data Items, wherein each data Item contains one item of target data, and for transmitting the data Items to a first data pipeline;
the data cleaning module is used for creating a plurality of data cleaning functions of different types, receiving the data items through the first data pipeline, inputting the data items into the corresponding data cleaning functions according to the types of the data items for cleaning, and transmitting the cleaned data items meeting the requirements to the second data pipeline;
the data warehouse-in module is used for creating a plurality of data warehouse-in functions of different types, receiving the data Item through the second data pipeline, inputting the data Item into the corresponding data warehouse-in function according to the category of the data Item for warehouse-in query processing, and updating the database according to the query processing result.
Preferably, each key information bar includes a plurality of data fields, and field names of data fields representing the same content in the key information bars of different dimensions are the same, and the data preparation module further integrates a table name and a table unique index corresponding to each key information bar into the same information table for unified management.
Preferably, the system further comprises a data post-processing module, which is used for sorting, as a whole, all data Items in the database that belong to the same category as a newly added data Item whenever a new data Item is added to the database.
Preferably, the data post-processing module comprises a marking module and a sorting module; the marking module is used for receiving the data Item from a third data pipeline with a data marking function, marking the data Item, and writing the feature name of the data Item into Redis; the third data pipeline is used for receiving the data Items that meet the warehousing condition; the sorting module is used for reading the corresponding feature names from Redis with a data sorting function and sorting the marks of the data Items of the same kind in the database based on the feature names read.
The invention also discloses another crawling data processing system based on the data stream, which comprises:
one or more processors;
a memory;
and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the programs including instructions for performing the data stream based crawling data processing method as described above.
The invention also discloses a computer readable storage medium comprising a computer program executable by a processor to perform a data stream based crawling data processing method as described above.
Compared with the prior art, the technical scheme of the invention designs the framework of the program that processes crawled data around the idea of a data flow: the processing stages are connected in series, and only the final warehousing stage is communicatively connected to the database. Data to be processed thus flows sequentially from the acquisition stage into the cleaning stage and from the cleaning stage into the warehousing stage, where the data that meets the requirements is finally updated to the database. This processing scheme therefore has a clear logical structure and is easy to extend, allows a data acquisition project to be set up quickly, and is less error-prone when qualifying data are stored in the database.
Drawings
Fig. 1 is a schematic diagram of a crawling data processing method in an embodiment of the present invention.
Fig. 2 is a flowchart of the crawling data processing method in an embodiment of the present invention.
Detailed Description
In order to describe the technical content, the constructional features, the achieved objects and effects of the present invention in detail, the following description is made in connection with the embodiments and the accompanying drawings.
This embodiment discloses a crawling data processing method based on data flow, which uses a crawler tool to crawl data from web pages, other devices, and similar sources. Specifically, as shown in fig. 1 and 2, the data processing method includes:
S1, a data preparation stage: according to project requirements, creating a plurality of key information bars, each belonging to a different dimension, for use in crawling data;
S2, entering a data acquisition stage: crawling target data from a target web page or other device based on the key information bars by means of a crawler tool to generate a plurality of data Items (namely data containers), wherein each data Item contains one item of target data, and transmitting the data Items to a first data pipeline;
S3, entering a data cleaning stage: first, creating a plurality of data cleaning functions of different types;
S4, receiving the data Item through the first data pipeline, inputting the data Item into the data cleaning function corresponding to its category for cleaning, and transmitting the cleaned data Items that meet the requirements to a second data pipeline;
S5, entering a data warehousing stage: first, creating a plurality of data warehousing functions of different types;
S6, receiving the data Item through the second data pipeline, inputting the data Item into the data warehousing function corresponding to its category for warehousing query processing, and updating the database according to the query result. That is, the database is queried for an object identical to the data in the current data Item. If no such object exists, the creation time and the update time are initialized for the current data, and the insertion is then performed. If the object exists, the field values of the new data and the old data are compared: fields whose values are the same are skipped, while fields whose values differ are collected into an update dictionary, the update time is added to it, and the update is then performed.
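One possible reading of the warehousing query step in S6 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `component` table, its columns, the `(name, version)` unique index, and the function name `warehouse_component` are all hypothetical, and SQLite stands in for whatever database is actually used. It also assumes that unchanged fields are skipped and only differing fields enter the update dictionary.

```python
import sqlite3
from datetime import datetime, timezone


def now_iso() -> str:
    """Current UTC time as an ISO-8601 string."""
    return datetime.now(timezone.utc).isoformat()


def warehouse_component(conn: sqlite3.Connection, item: dict) -> str:
    """Insert or update one data Item; returns 'inserted', 'updated' or 'skipped'."""
    # Hypothetical unique index for this category: (name, version).
    row = conn.execute(
        "SELECT license, homepage FROM component WHERE name = ? AND version = ?",
        (item["name"], item["version"]),
    ).fetchone()

    if row is None:
        # No matching object: initialise both timestamps, then insert.
        ts = now_iso()
        conn.execute(
            "INSERT INTO component "
            "(name, version, license, homepage, created_at, updated_at) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            (item["name"], item["version"], item["license"], item["homepage"], ts, ts),
        )
        return "inserted"

    # Object exists: compare field by field and keep only the differences.
    old = {"license": row[0], "homepage": row[1]}
    update = {key: item[key] for key in old if item[key] != old[key]}
    if not update:
        return "skipped"

    update["updated_at"] = now_iso()
    # Column names come from the fixed dict above, never from external input.
    assignments = ", ".join(f"{column} = ?" for column in update)
    conn.execute(
        f"UPDATE component SET {assignments} WHERE name = ? AND version = ?",
        (*update.values(), item["name"], item["version"]),
    )
    return "updated"


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE component (name TEXT, version TEXT, license TEXT, homepage TEXT, "
    "created_at TEXT, updated_at TEXT, UNIQUE (name, version))"
)
item = {"name": "component A", "version": "1.0",
        "license": "MIT", "homepage": "https://example.com"}
first = warehouse_component(conn, item)    # new object
second = warehouse_component(conn, item)   # identical object
third = warehouse_component(conn, {**item, "license": "Apache-2.0"})  # changed field
```

Because the existence check and the write happen in the same stage, the same Item arriving twice results in a skip rather than a second row, which is the repeated-warehousing problem the background section describes.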
In the data processing method of this embodiment, the framework of the program that processes crawled data is designed around the idea of a data flow. As shown in fig. 1, the processing stages are connected in series, and only the final warehousing stage is communicatively connected to the database, so that the data to be processed flows from the acquisition stage into the cleaning stage and from the cleaning stage into the warehousing stage, where the data that meets the requirements is finally updated to the database. This processing scheme therefore has a clear logical structure and is easy to extend, allows a data acquisition project to be set up quickly, and is less error-prone when qualifying data are stored in the database.
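The serial flow described above can be illustrated with a short sketch. It is only an approximation under assumptions: Python generators stand in for the first and second data pipelines, a plain dictionary stands in for the database, and the categories (`component`, `vulnerability`), field names, and function names are hypothetical examples rather than anything specified by the source.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Optional

Item = Dict[str, str]


def acquire(raw_records: Iterable[Item]) -> Iterator[Item]:
    """Acquisition stage: emit one data Item per crawled record."""
    for record in raw_records:
        yield dict(record)


def clean_component(item: Item) -> Optional[Item]:
    """Cleaning function for the hypothetical 'component' category."""
    name = item.get("name", "").strip()
    return {**item, "name": name} if name else None


def clean_vulnerability(item: Item) -> Optional[Item]:
    """Cleaning function for the hypothetical 'vulnerability' category."""
    ident = item.get("id", "").strip().upper()
    return {**item, "id": ident} if ident else None


CLEANERS: Dict[str, Callable[[Item], Optional[Item]]] = {
    "component": clean_component,
    "vulnerability": clean_vulnerability,
}


def clean(items: Iterable[Item]) -> Iterator[Item]:
    """Cleaning stage: dispatch by category, pass on only Items that qualify."""
    for item in items:
        cleaner = CLEANERS.get(item["category"])
        if cleaner is None:
            continue  # no cleaning function registered for this category
        cleaned = cleaner(item)
        if cleaned is not None:
            yield cleaned


def warehouse(items: Iterable[Item], database: Dict[str, List[Item]]) -> int:
    """Warehousing stage: the only stage that touches the database."""
    stored = 0
    for item in items:
        database.setdefault(item["category"], []).append(item)
        stored += 1
    return stored


raw = [
    {"category": "component", "name": "  component A ", "version": "1.0"},
    {"category": "component", "name": "   ", "version": "2.0"},  # fails cleaning
    {"category": "vulnerability", "id": " vuln-001 "},
]
database: Dict[str, List[Item]] = {}
stored = warehouse(clean(acquire(raw)), database)
```

Each stage only consumes the output of the previous one, so neither `acquire` nor `clean` ever needs a database handle.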
Further, each key information bar includes a plurality of data fields, field names of the data fields representing the same content (such as release time) in the key information bars with different dimensions are the same, and a table name and a table unique index corresponding to each key information bar are integrated in the same information table so as to perform unified management and facilitate subsequent unified call.
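As a rough illustration of this arrangement, the sketch below models two key information bars and a single information table. All names here (`KeyInfoBar`, `INFO_TABLE`, the dimensions, the field names, and the unique indexes) are hypothetical; the source only states that same-content fields share a name and that table names and unique indexes are kept in one table.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class KeyInfoBar:
    """One key information bar: the fields to crawl for a single dimension."""
    dimension: str
    fields: Tuple[str, ...]
    table_name: str
    unique_index: Tuple[str, ...]


# Hypothetical dimensions; 'release_time' means the same thing in both,
# so it carries the same field name in both bars.
BARS: List[KeyInfoBar] = [
    KeyInfoBar("component", ("name", "version", "release_time"),
               "component", ("name", "version")),
    KeyInfoBar("vulnerability", ("vuln_id", "severity", "release_time"),
               "vulnerability", ("vuln_id",)),
]

# Single information table: dimension -> (table name, unique index).
INFO_TABLE: Dict[str, Tuple[str, Tuple[str, ...]]] = {
    bar.dimension: (bar.table_name, bar.unique_index) for bar in BARS
}


def shared_fields(bars: List[KeyInfoBar]) -> Set[str]:
    """Field names that appear in every bar."""
    common = set(bars[0].fields)
    for bar in bars[1:]:
        common &= set(bar.fields)
    return common


common = shared_fields(BARS)
```

With the table names and unique indexes in one place, a later stage can look up how to identify an existing record for any category without per-category configuration scattered across modules.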
Furthermore, the data processing method in this embodiment further includes a data post-processing stage: when a new data Item is added to the database, the data Items in the database that belong to the same category as the newly added data Item are sorted as a whole, so as to facilitate subsequent calls.
Specifically, the method for sorting the data Items as a whole includes:
first, when a data Item is processed by the data warehousing function, the data Items that meet the warehousing condition are transmitted to a third data pipeline;
then, the data Item is received from the third data pipeline with a data marking function, the data Item is marked, and the feature name of the data Item is written into Redis (Remote Dictionary Server, an open-source, networked, log-type key-value database written in ANSI C that can run in memory or persist to disk and provides APIs for multiple languages); for example, if the data stored in the database is component A with version number 1.0, the feature name "component A" is written into Redis;
finally, the corresponding feature names are read from Redis with a data sorting function, and the marks of the data Items of the same kind in the database are sorted based on the feature names read. For example, if "component A" is read, all data for component A is queried from the database. Suppose three records are found: component A (version 1.0), component A (version 2.0), and component A (version 3.0), of which versions 1.0 and 2.0 already existed, with mark sequence numbers 2 and 1 respectively (1 denoting the latest). After component A (version 3.0) is added and the data sorting function has run, the mark sequence numbers become 3 for version 1.0, 2 for version 2.0, and 1 for version 3.0 (the latest).
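The component A example can be reproduced with a small sketch. Several things here are assumptions rather than details from the source: a Python `set` stands in for Redis so the example runs without a server (with the redis-py client, `sadd` and `spop` would play the corresponding roles), a list of dictionaries stands in for the database, the mark is stored in a hypothetical `seq` field, and records are ordered by numeric version, which the source implies but does not state.

```python
from typing import Dict, List, Set, Tuple


def version_key(version: str) -> Tuple[int, ...]:
    """Turn '2.10' into (2, 10) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def mark(item: Dict, pending: Set[str]) -> None:
    """Marking function: give the new Item a provisional mark and
    record its feature name for the sorting function."""
    item["seq"] = 0  # placeholder until the sorting function runs
    pending.add(item["name"])  # stand-in for writing the name into Redis


def sort_pending(database: List[Dict], pending: Set[str]) -> None:
    """Sorting function: renumber every record sharing a pending feature name."""
    while pending:
        name = pending.pop()  # stand-in for reading the name from Redis
        rows = [row for row in database if row["name"] == name]
        rows.sort(key=lambda row: version_key(row["version"]), reverse=True)
        for position, row in enumerate(rows, start=1):
            row["seq"] = position  # 1 denotes the latest


database: List[Dict] = [
    {"name": "component A", "version": "1.0", "seq": 2},
    {"name": "component A", "version": "2.0", "seq": 1},
]
pending: Set[str] = set()

new_item = {"name": "component A", "version": "3.0"}
database.append(new_item)  # the Item has met the warehousing condition
mark(new_item, pending)
sort_pending(database, pending)

seqs = {row["version"]: row["seq"] for row in database}
```

Storing only the feature name, rather than the whole Item, keeps the hand-off small and lets several new versions of the same component trigger a single re-sort.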
In another preferred embodiment of the present invention, a crawling data processing system based on data flow is also disclosed, which includes the following functional modules:
the data preparation module is used for creating a plurality of key information bars, each belonging to a different dimension, for use in crawling data;
the data acquisition module is used for crawling target data based on the key information bars by means of a crawler tool to generate a plurality of data Items, wherein each data Item contains one item of target data, and for transmitting the data Items to a first data pipeline;
the data cleaning module is used for creating a plurality of data cleaning functions of different types, receiving the data items through the first data pipeline, inputting the data items into the corresponding data cleaning functions according to the types of the data items for cleaning, and transmitting the cleaned data items meeting the requirements to the second data pipeline;
the data warehouse-in module is used for creating a plurality of data warehouse-in functions of different types, receiving the data Item through the second data pipeline, inputting the data Item into the corresponding data warehouse-in function according to the category of the data Item for warehouse-in query processing, and updating the database according to the query processing result.
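One way a module could hold "a plurality of functions of different types" while staying easy to extend is a registry keyed by category, sketched below for the cleaning module. The decorator `cleaner`, the registry `CLEANING_FUNCTIONS`, and the two categories are hypothetical names chosen for illustration; the source does not prescribe any particular registration mechanism.

```python
from typing import Callable, Dict, Optional

Item = Dict[str, str]
Cleaner = Callable[[Item], Optional[Item]]

# Registry of cleaning functions, keyed by data Item category.
CLEANING_FUNCTIONS: Dict[str, Cleaner] = {}


def cleaner(category: str) -> Callable[[Cleaner], Cleaner]:
    """Decorator that registers a cleaning function for one category."""
    def register(func: Cleaner) -> Cleaner:
        CLEANING_FUNCTIONS[category] = func
        return func
    return register


@cleaner("component")
def clean_component(item: Item) -> Optional[Item]:
    """Drop components without a name; trim whitespace otherwise."""
    name = item.get("name", "").strip()
    return {**item, "name": name} if name else None


@cleaner("license")
def clean_license(item: Item) -> Optional[Item]:
    """Normalise the license identifier to upper case."""
    ident = item.get("identifier", "").strip().upper()
    return {**item, "identifier": ident} if ident else None


def dispatch(item: Item) -> Optional[Item]:
    """Route an Item to the cleaning function registered for its category."""
    func = CLEANING_FUNCTIONS.get(item.get("category", ""))
    return func(item) if func is not None else None


kept = dispatch({"category": "license", "identifier": " mit "})
dropped = dispatch({"category": "component", "name": "  "})
unknown = dispatch({"category": "advisory", "title": "x"})
```

Under this arrangement, supporting a new dimension means adding one decorated function; the dispatch code and the other categories are untouched. The same pattern would apply to the warehousing functions.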
Further, each key information bar includes a plurality of data fields, field names of data fields representing the same content in the key information bars of different dimensions are the same, and the data preparation module integrates a table name and a table unique index corresponding to each key information bar into the same information table so as to perform unified management.
Furthermore, the processing system in this embodiment further includes a data post-processing module, which is configured to, when a new data Item is added to the database, sort as a whole all data Items in the database that belong to the same category as the newly added data Item.
Specifically, the data post-processing module comprises a marking module and a sorting module; the marking module is used for receiving the data Item from a third data pipeline with a data marking function, marking the data Item, and writing the feature name of the data Item into Redis; the third data pipeline is used for receiving the data Items that meet the warehousing condition; the sorting module is used for reading the corresponding feature names from Redis with a data sorting function and sorting the marks of the data Items of the same kind in the database based on the feature names read.
The present invention also discloses another data stream based crawling data processing system comprising one or more processors, a memory and one or more programs, wherein one or more programs are stored in the memory and configured to be executed by the one or more processors, the programs comprising instructions for performing the data stream based crawling data processing method as described above. The processor may employ a general-purpose central processing unit (Central Processing Unit, CPU), microprocessor, application specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits for executing associated programs to perform the functions required to be performed by the modules in the data flow based crawling data processing system of the embodiments of the present application or to perform the data flow based crawling data processing method of the embodiments of the present application.
The invention also discloses a computer readable storage medium comprising a computer program executable by a processor to perform a data stream based crawling data processing method as described above. The computer readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server, data center, etc. that contains an integration of one or more available media. The usable medium may be a read-only memory (ROM), or a random-access memory (random access memory, RAM), or a magnetic medium, for example, a floppy disk, a hard disk, a magnetic tape, a magnetic disk, or an optical medium, for example, a digital versatile disk (digital versatile disc, DVD), or a semiconductor medium, for example, a Solid State Disk (SSD), or the like.
The present application also discloses a computer program product or a computer program comprising computer instructions stored in a computer readable storage medium. The processor of the electronic device reads the computer instructions from the computer readable storage medium and executes the computer instructions to cause the electronic device to perform the data stream based crawling data processing method described above.
The foregoing discloses only preferred embodiments of the present invention and is not intended to limit its scope, which is defined by the claims that follow.

Claims (10)

1. A method for crawling data processing based on a data stream, comprising:
creating a plurality of key information bars, each belonging to a different dimension, for use in crawling data;
crawling target data based on the key information bars by means of a crawler tool to generate a plurality of data Items, wherein each data Item contains one item of target data, and transmitting the data Items to a first data pipeline;
creating a plurality of different types of data cleaning functions;
receiving the data Item through the first data pipeline, inputting the data Item into a corresponding data cleaning function according to the category of the data Item for cleaning, and transmitting the cleaned data Item meeting the requirement to a second data pipeline;
creating a plurality of data warehouse-in functions of different types;
and receiving the data Item through the second data pipeline, inputting the data Item into a corresponding data warehousing function according to the category of the data Item for warehousing query processing, and updating a database according to the query processing result.
2. The method according to claim 1, wherein each key information bar includes a plurality of data fields, and field names of data fields representing the same content in the key information bars of different dimensions are the same, and a table name and a table unique index corresponding to each key information bar are integrated in the same information table for unified management.
3. The crawling data processing method based on data flow according to claim 1, characterized in that when a new data Item is added to the database, all data Items in the database that belong to the same category as the newly added data Item are sorted as a whole.
4. The crawling data processing method based on data flow according to claim 3, wherein the method for sorting the data Items as a whole comprises:
when a data Item is processed by the data warehousing function, transmitting the data Items that meet the warehousing condition to a third data pipeline;
receiving the data Item from the third data pipeline with a data marking function, marking the data Item, and writing the feature name of the data Item into Redis;
and reading the corresponding feature names from Redis with a data sorting function, and sorting the marks of the data Items of the same kind in the database based on the feature names read.
5. A data flow based crawling data processing system, comprising:
the data preparation module is used for creating a plurality of key information bars, each belonging to a different dimension, for use in crawling data;
the data acquisition module is used for crawling target data based on the key information bars by means of a crawler tool to generate a plurality of data Items, wherein each data Item contains one item of target data, and for transmitting the data Items to a first data pipeline;
the data cleaning module is used for creating a plurality of data cleaning functions of different types, receiving the data items through the first data pipeline, inputting the data items into the corresponding data cleaning functions according to the types of the data items for cleaning, and transmitting the cleaned data items meeting the requirements to the second data pipeline;
the data warehouse-in module is used for creating a plurality of data warehouse-in functions of different types, receiving the data Item through the second data pipeline, inputting the data Item into the corresponding data warehouse-in function according to the category of the data Item for warehouse-in query processing, and updating the database according to the query processing result.
6. The system of claim 5, wherein each key information bar includes a plurality of data fields, the field names of data fields representing the same content in the key information bars of different dimensions are the same, and the data preparation module further integrates the table name and the table unique index corresponding to each key information bar into the same information table for unified management.
7. The crawling data processing system based on data flow of claim 5, further comprising a data post-processing module, wherein the data post-processing module is configured to, when a new data Item is added to the database, sort as a whole all data Items in the database that belong to the same category as the newly added data Item.
8. The data stream based crawling data processing system of claim 7, wherein said data post-processing module comprises a marking module and a sorting module; the marking module is used for receiving the data Item from a third data pipeline with a data marking function, marking the data Item, and writing the feature name of the data Item into Redis; the third data pipeline is used for receiving the data Items that meet the warehousing condition; the sorting module is used for reading the corresponding feature names from Redis with a data sorting function and sorting the marks of the data Items of the same kind in the database based on the feature names read.
9. A data flow based crawling data processing system, comprising:
one or more processors;
a memory;
and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the programs comprising instructions for performing the data flow based crawling data processing method of any of claims 1-4.
10. A computer readable storage medium comprising a computer program executable by a processor to perform the data stream based crawling data processing method of any of claims 1 to 4.
CN202310244348.4A 2023-03-06 2023-03-06 Crawling data processing method and system based on data stream Pending CN116244486A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310244348.4A CN116244486A (en) 2023-03-06 2023-03-06 Crawling data processing method and system based on data stream

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310244348.4A CN116244486A (en) 2023-03-06 2023-03-06 Crawling data processing method and system based on data stream

Publications (1)

Publication Number Publication Date
CN116244486A (en) 2023-06-09

Family

ID=86633063

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310244348.4A Pending CN116244486A (en) 2023-03-06 2023-03-06 Crawling data processing method and system based on data stream

Country Status (1)

Country Link
CN (1) CN116244486A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110251878A1 (en) * 2010-04-13 2011-10-13 Yahoo! Inc. System for processing large amounts of data
CN105069117A (en) * 2015-08-11 2015-11-18 国网技术学院 Data flow efficiency improving method based on storage process
CN110781368A (en) * 2019-10-22 2020-02-11 北京赛时科技有限公司 Information crawling system and method for specified experts
CN112597373A (en) * 2020-12-29 2021-04-02 科技谷(厦门)信息技术有限公司 Data acquisition method based on distributed crawler engine

Similar Documents

Publication Publication Date Title
CN109669933B (en) Transaction data intelligent processing method and device and computer readable storage medium
CN105138312B (en) A kind of table generation method and device
CN104391978A (en) Method and device for storing and processing web pages of browsers
CN115269515B (en) Processing method for searching specified target document data
CN104346331A (en) Retrieval method and system for XML database
CN111627552A (en) Medical streaming data blood relationship analysis and storage method and device
CN112307191A (en) Multi-system interactive log query method, device, equipment and storage medium
KR20170115109A (en) Text-Mining Application Technique for Productive Construction Document Management
CN111143370B (en) Method, apparatus and computer-readable storage medium for analyzing relationships between a plurality of data tables
CN110765402A (en) Visual acquisition system and method based on network resources
CN105677723A (en) Method for establishing and searching data labels for industrial signal source
CN117076742A (en) Data blood edge tracking method and device and electronic equipment
US20180060404A1 (en) Schema abstraction in data ecosystems
WO2016206395A1 (en) Weekly report information processing method and device
CN116244486A (en) Crawling data processing method and system based on data stream
CN116450664A (en) Data processing method, device, equipment and storage medium
CN109948015B (en) Meta search list result extraction method and system
CN114882242A (en) Violation image identification method and system based on computer vision
CN112131215B (en) Bottom-up database information acquisition method and device
JP5444071B2 (en) Fault information collection system, method and program
CN111352824A (en) Test method and device and computer equipment
CN112925856B (en) Entity relationship analysis method, entity relationship analysis device, entity relationship analysis equipment and computer storage medium
CN111931502B (en) Word segmentation processing method and system and word segmentation searching method
CN110020050B (en) Method for realizing intelligent capture rule configuration technology based on standard documents
Azeroual A text and data analytics approach to enrich the quality of unstructured research information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination