WO2021012553A1 - Data processing method and related device - Google Patents

Data processing method and related device

Info

Publication number
WO2021012553A1
Authority
WO
WIPO (PCT)
Prior art keywords
target data
index
target
server
preset
Prior art date
Application number
PCT/CN2019/120960
Other languages
French (fr)
Chinese (zh)
Inventor
张松松
冯承勇
Original Assignee
深圳壹账通智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳壹账通智能科技有限公司
Publication of WO2021012553A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2272 Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/243 Natural language query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2468 Fuzzy queries

Definitions

  • This application relates to the field of data processing technology, and in particular to a data processing method and related equipment.
  • At present, many data management platforms provide a function for querying target data by keyword. To use this function, an index corresponding to the target data must first be created in the storage area corresponding to the server. Once the index has been created, the target data can be queried by keyword, where the keyword is obtained by performing word segmentation on the target data.
  • Traditional services generally store target data in relational databases such as MySQL and Oracle. If the server creates the index corresponding to the target data through a relational database in order to support querying, an additional set of database services must be maintained, and unstructured target data cannot be segmented well enough to create an index. How to create an index and store the target data more efficiently, so that the target data can be queried, has therefore become an urgent problem.
  • The embodiments of the present application provide a data processing method and related equipment that help improve the efficiency of index creation.
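  • In a first aspect, an embodiment of the present application provides a data processing method, which is applied to a server, and the method includes:
  • when a storage request for storing target data is received from a client, performing field analysis on the target data to obtain a field analysis result, the field analysis result including the field corresponding to the target data and the semantic information of the field;
  • detecting whether a target index corresponding to the field exists in a preset index storage area and, if the target index does not exist in the preset index storage area, determining the storage structure type to which the target data belongs based on the semantic information;
  • sending an index request for creating the target index to a pre-connected search server, the index request carrying the storage structure type and the target data, and, on receiving the target index returned by the search server, storing the target data according to the target index.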
  • An embodiment of the present application provides a data processing device, which includes modules for executing the method of the first aspect.
  • An embodiment of the present application provides a server. The server includes a processor, a network interface, and a memory, which are connected to each other. The network interface is controlled by the processor to send and receive messages; the memory stores a computer program that supports the server in executing the above method, the computer program including program instructions; and the processor is configured to call the program instructions to execute the method of the first aspect.
  • An embodiment of the present application provides a computer non-volatile readable storage medium that stores a computer program. The computer program includes program instructions which, when executed by a processor, cause the processor to execute the method of the first aspect.
  • The server can create the target index according to the storage structure type to which the target data belongs, which prevents unnecessary fields from being split and helps improve the efficiency of index creation.
  • FIG. 1 is a schematic flowchart of a data processing method provided by an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of another data processing method provided by an embodiment of the present application.
  • FIG. 3 is a schematic block diagram of a data processing device provided by an embodiment of the present application.
  • FIG. 4 is a schematic block diagram of a server provided by an embodiment of the present application.
  • FIG. 1 is a schematic flowchart of a data processing method provided by an embodiment of the present application. The method is applied to a server and can be executed by the server. As shown in the figure, the data processing method may include:
  • 102: Perform field analysis on the target data to obtain a field analysis result, where the field analysis result includes the field corresponding to the target data and the semantic information of the field.
  • The above-mentioned server may be a server corresponding to the data management platform. It may be a single server or a server cluster composed of multiple servers, and it can provide data management services.
  • The data management platform may be a log platform; the log cloud platform can provide a keyword-based log query function.
  • The client may be an application or website corresponding to the log platform, or a terminal device on which the log platform application is installed or the log platform website is opened.
  • The target data may be unstructured data.
  • 103: Detect whether a target index corresponding to the field exists in the preset index storage area; if the target index does not exist in the preset index storage area, determine the storage structure type to which the target data belongs based on the semantic information.
  • The server needs to create an index corresponding to the target data in the index storage area in advance. After the index is created, the target data can be queried by keyword, where the index and the keyword correspond to each other.
  • The preset index storage area includes at least one index, and each index corresponds to a keyword.
  • The server may perform field analysis on the target data to obtain at least one field corresponding to the target data. The at least one field can then be compared with the pre-stored keyword of each index; if any of the fields matches the keyword of any index, it is determined that the target index exists in the preset index storage area.
  • If the server detects that none of the at least one field matches the keyword of any index, it determines that the target index does not exist in the preset index storage area.
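The detection step described above can be sketched in Python as follows. This is only a minimal illustration, assuming the preset index storage area can be modelled as a dictionary from index name to the keyword that index corresponds to; the names `find_target_index`, `index_keywords`, and the sample index names are hypothetical and do not come from the application.

```python
from typing import Dict, Iterable, Optional


def find_target_index(fields: Iterable[str],
                      index_keywords: Dict[str, str]) -> Optional[str]:
    """Return the name of the first index whose keyword matches any field.

    index_keywords is a hypothetical model of the preset index storage
    area: each index name maps to the single keyword it corresponds to.
    None is returned when no field matches, i.e. the target index does
    not exist and a new one has to be requested.
    """
    field_set = set(fields)
    for index_name, keyword in index_keywords.items():
        if keyword in field_set:
            return index_name
    return None


# Two existing indexes, each associated with one keyword (sample data).
stored = {"idx_user": "user_name", "idx_company": "company_name"}
hit = find_target_index(["user_name", "message"], stored)
miss = find_target_index(["trace_id"], stored)
```

Here `hit` names an existing index, so no index request is needed, while `miss` is `None`, which corresponds to the case where the server goes on to determine the storage structure type.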
  • The foregoing storage structure types may include a keyword type and a word segmentation type.
  • When the server detects that the target index does not exist in the preset index storage area, it can examine the semantic information of each field corresponding to the target data. If it detects, based on the semantic information, that any field corresponding to the target data is used for complete matching search, the storage structure type to which the target data belongs is determined to be the keyword type.
  • If it detects, based on the semantic information, that the field corresponding to the target data is used for fuzzy matching search, the storage structure type to which the target data belongs is determined to be the word segmentation type.
  • A field used for complete matching search can be called a first field, and the semantic information of the first field is unique. For example, the semantic information of the first field can represent a user name, a certificate number, and so on, where each user corresponds to only one user name and one certificate number.
  • A field used for fuzzy search can be called a second field, and the semantic information of the second field is not unique. For example, the semantic information of the second field can represent a company name, and one company name can correspond to multiple users.
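The rule for choosing between the keyword type and the word segmentation type can be illustrated with a short Python sketch. It assumes that field analysis yields one semantic label per field and that a fixed set of labels marks unique semantics; the label strings, `EXACT_MATCH_SEMANTICS`, and `determine_storage_structure_type` are hypothetical names chosen for illustration, based on the user name, certificate number, and company name examples in the text.

```python
from enum import Enum
from typing import Dict


class StorageStructureType(Enum):
    KEYWORD = "keyword"                      # complete matching search
    WORD_SEGMENTATION = "word_segmentation"  # fuzzy search


# Hypothetical labels for semantics that are unique per user.
EXACT_MATCH_SEMANTICS = frozenset({"user_name", "certificate_number"})


def determine_storage_structure_type(
        field_semantics: Dict[str, str]) -> StorageStructureType:
    """Pick the storage structure type from per-field semantic labels.

    If any field carries a semantic used for complete matching search,
    the keyword type is chosen; otherwise the data is treated as
    fuzzy-searchable and the word segmentation type is chosen.
    """
    for semantic in field_semantics.values():
        if semantic in EXACT_MATCH_SEMANTICS:
            return StorageStructureType.KEYWORD
    return StorageStructureType.WORD_SEGMENTATION


exact = determine_storage_structure_type({"uname": "user_name", "msg": "free_text"})
fuzzy = determine_storage_structure_type({"org": "company_name"})
```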
  • The server can create different storage structures for different target data, which prevents the search server from splitting unnecessary characters when creating the target index and can effectively improve the efficiency of data processing.
  • An index request for creating the target index is sent to the pre-connected search server. The index request carries the storage structure type and the target data, so that the search server can create a target index matching the storage structure type for the target data according to the index request.
  • A configuration file for the search server may be pre-configured. The configuration file contains the address, port, protocol, connection timeout period, number of routes of the protocol, and maximum number of connections used to connect to the search server. When the server detects that the search server has started, it can register with the search server based on the configuration file, that is, establish a connection with the search server, so that subsequent data interaction with the search server is possible.
  • The search server may be Elasticsearch, a search server based on Lucene. It provides a distributed, multi-user-capable full-text search engine with a RESTful web interface. Elasticsearch is developed in Java and released as open source under the terms of the Apache license. It is a popular enterprise search engine designed for cloud computing; it offers real-time search and is stable, reliable, fast, and easy to install and use.
  • The server may send the index request for creating the target index to the search server with which the connection has been established.
  • The index request includes the storage structure type of the target data and the target data.
  • The search server can automatically create a target index matching the storage structure type for the target data based on the target data and the storage structure type, and return the target index to the server.
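As an illustration of how a storage structure type could translate into an index definition when the search server is Elasticsearch, the Python sketch below builds the JSON body of a create-index request. The application does not specify the request format; mapping the keyword type to the Elasticsearch `keyword` field type and the word segmentation type to the analyzed `text` field type is an assumption of this sketch, and `build_index_request` is a hypothetical helper. Only the mapping part is shown; sending the request and the target data is omitted.

```python
import json
from typing import Dict, List

# Assumed correspondence between the storage structure types in the text
# and Elasticsearch field types: `keyword` fields are indexed as a whole
# (exact match), `text` fields are analyzed into terms (full-text search).
ES_FIELD_TYPE = {"keyword": "keyword", "word_segmentation": "text"}


def build_index_request(fields: List[str], storage_structure_type: str) -> Dict:
    """Build a create-index body whose mapping matches the storage type."""
    es_type = ES_FIELD_TYPE[storage_structure_type]
    return {
        "mappings": {
            "properties": {name: {"type": es_type} for name in fields}
        }
    }


exact_body = build_index_request(["user_name"], "keyword")
fuzzy_body = build_index_request(["company_name"], "word_segmentation")
payload = json.dumps(exact_body)  # JSON text that would be sent to the search server
```

With a `keyword` mapping the field value is not split during indexing, which is consistent with the stated goal of avoiding unnecessary segmentation for fields that are only searched by complete match.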
  • When the server receives the target index returned by the search server, it can store the target data in the storage area corresponding to the target index and assign keywords to the target data, so that the keywords can subsequently be used to query the target data.
  • The storage area corresponding to the target index may be a disk or a folder.
  • The above data processing method can be applied to a plug-in corresponding to the data management platform, and the plug-in is installed in the search server.
  • For example, when the data management platform is a log platform, the plug-in can create a folder for its own service under the service folder of the search server. The folder includes the log cloud plug-in in the form of a jar package, the operation information of the current plug-in, and the configuration files required by the log platform.
  • The operation information includes: description information of the plug-in, which describes the role of the plug-in; the version information of the plug-in; the name of the plug-in displayed in the search server; the entry point of the plug-in; the Java version information used by the plug-in; and the corresponding specific version.
  • The target data may be directly stored in the target index.
  • The server can create the target index according to the storage structure type to which the target data belongs, which prevents unnecessary fields from being split and helps improve the efficiency of index creation.
  • FIG. 2 is a schematic flowchart of another data processing method provided by an embodiment of the present application. The method can be executed by a server. As shown in the figure, the data processing method can include:
  • The field analysis result includes the field corresponding to the target data and the semantic information of the field.
  • 202: Detect whether a target index corresponding to the field exists in the preset index storage area; if the target index does not exist in the preset index storage area, determine the storage structure type to which the target data belongs based on the semantic information.
  • The index request carries the storage structure type and the target data, so that the search server can create a target index matching the storage structure type for the target data according to the index request.
  • For the specific implementation of steps 201 to 204 above, reference may be made to the related description of steps 101 to 105 in the above embodiment, which will not be repeated here.
  • The aforementioned preset update strategy may include a delayed update strategy or a time update strategy. The delayed update strategy instructs the client to update the target data when a trigger operation for the target data is detected; the time update strategy instructs the client to update the target data after a preset time.
  • The trigger operation may be a search operation for searching the target data, a viewing operation for viewing the target data, or any other operation on the target data, which is not specifically limited in this application.
  • The preset time can be 0 s, 1 s, and so on. It can be preset by the developer as a default or selected by the user as needed; after the preset time has been determined, the user can also adjust it as needed. A preset time of 0 s can be understood as refreshing immediately.
  • For example, if the preset update strategy is a time update strategy that instructs the client to update the target data after 0 s, then after the client receives the update instruction information for the target data, it can update the target data immediately.
  • As another example, the delayed update strategy instructs the client to update the target data when it detects a viewing operation for the target data.
  • The viewing operation may be, for example, a touch operation on a viewing button or a voice signal used to view the target data.
  • With delayed refresh, the target data is refreshed the next time a trigger operation occurs, which protects hardware performance while still letting users query the saved target data in time.
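The two update strategies can be contrasted with a small Python sketch of the decision a client might make after receiving the update instruction information. The `UpdateStrategy` class, its field names, and `should_update` are hypothetical; the application does not describe the client in this detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UpdateStrategy:
    kind: str                    # "delayed" or "time"
    preset_seconds: float = 0.0  # only meaningful for the time strategy


def should_update(strategy: UpdateStrategy,
                  seconds_since_instruction: float,
                  trigger_detected: bool) -> bool:
    """Decide whether the client should refresh the target data now.

    delayed: refresh only when a trigger operation (search, view, ...)
             on the target data has been detected.
    time:    refresh once the preset time has elapsed; 0 s means
             refresh immediately.
    """
    if strategy.kind == "delayed":
        return trigger_detected
    if strategy.kind == "time":
        return seconds_since_instruction >= strategy.preset_seconds
    raise ValueError(f"unknown update strategy: {strategy.kind}")


immediate = should_update(UpdateStrategy("time", 0.0), 0.0, False)
too_early = should_update(UpdateStrategy("time", 1.0), 0.5, False)
on_view = should_update(UpdateStrategy("delayed"), 10.0, True)
idle = should_update(UpdateStrategy("delayed"), 10.0, False)
```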
  • If the target index returned by the search server is not received within a preset time, a preset index may be generated, and the target data may be stored based on the preset index.
  • The waiting time (i.e., the preset time) for the search server to return the target index can be set in advance. When the index request for creating the target index is sent to the pre-connected search server, a timer is started. If the current duration of the timer is greater than or equal to the waiting time and the returned target index has not been received, a preset index may be obtained, and the target data may be stored in the storage area corresponding to the preset index.
  • After the server generates the preset index, it can also create an asynchronous thread for receiving the target index returned by the search server, that is, start a new thread that continues to wait for the search server to return the target index. If the server receives the target index returned by the search server through this asynchronous thread, the preset index is updated with the target index, and the target data is stored according to the target index.
  • A callback listener may be registered in advance for the event of receiving the target index returned by the search server.
  • The callback listener may be started (i.e., one of the aforementioned asynchronous threads is created) to receive the target index returned by the search server.
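The timeout fallback and the later replacement of the preset index can be sketched with Python's `concurrent.futures`. This is a simplified illustration rather than the implementation described in the application: the request runs in a worker thread from the start and a done-callback plays the role of the callback listener, whereas the text describes creating the asynchronous thread after the preset index is generated. `store_with_fallback`, `request_target_index`, `slow_search_server`, and the `storage` dictionary are hypothetical names.

```python
import concurrent.futures
import threading
import time
from typing import Callable, Dict


def store_with_fallback(request_target_index: Callable[[], str],
                        waiting_seconds: float,
                        storage: Dict[str, str],
                        preset_index: str = "preset_index") -> threading.Event:
    """Use the target index if it arrives in time, else a preset index.

    Returns an event that is set once storage["index"] holds the target
    index returned by the search server.
    """
    updated = threading.Event()
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(request_target_index)

    def on_late_response(done: concurrent.futures.Future) -> None:
        # Callback listener: replace the preset index with the target index.
        storage["index"] = done.result()
        updated.set()

    try:
        storage["index"] = future.result(timeout=waiting_seconds)
        updated.set()
    except concurrent.futures.TimeoutError:
        storage["index"] = preset_index            # fall back for now
        future.add_done_callback(on_late_response)
    finally:
        executor.shutdown(wait=False)              # keep waiting in background
    return updated


def slow_search_server() -> str:
    """Stand-in for a search server that answers after the waiting time."""
    time.sleep(0.5)
    return "target_index"


storage: Dict[str, str] = {}
updated = store_with_fallback(slow_search_server, 0.05, storage)
index_while_waiting = storage["index"]   # preset index is used meanwhile
updated.wait(timeout=5)
index_after_callback = storage["index"]  # replaced once the response arrives
```

The event is used instead of waiting on the future directly because done-callbacks run after waiters on the future have been released, so reading `storage` immediately after `future.result()` could still observe the preset index.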
  • The data processing method in the embodiments of the present application may be applied to a server cluster in which N nodes (such as N servers) are deployed.
  • The number of primary shards and replica shards for the target data can be determined based on the number of nodes in the server cluster; the number of primary and replica shards corresponding to each node is added to the index request and sent to the search server.
  • The search server can shard the target data according to the number of primary and replica shards corresponding to each node, create a target index corresponding to each node, and return each target index to the corresponding node. In this way, the number of primary and replica shards can be set according to the number of nodes deployed in the cluster, which effectively reduces the waste of hardware resources caused by an excessive number of replica shards.
  • The number of primary and replica shards corresponding to each node can be set so that the number of replica shards on the second node equals the number of primary shards on the first node, and the number of replica shards on the third node equals the number of primary shards on the second node.
  • A primary shard and its replica shard store the same data, which prevents data loss caused by hardware problems.
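The per-node shard counts described above can be worked through with a short Python sketch. It follows the stated rule that each node holds as many replica shards as the preceding node holds primary shards. Two points are assumptions of this sketch and are not stated in the text: the first node takes its replica count from the last node (wrap-around), and a single-node cluster gets no replicas. `plan_shards` is a hypothetical helper.

```python
from typing import Dict, List


def plan_shards(primaries_per_node: List[int]) -> List[Dict[str, int]]:
    """Derive per-node primary and replica shard counts for N nodes.

    Node i (1-based) keeps its own primaries and as many replicas as
    node i-1 has primaries. Assumed: node 1 mirrors the last node, and
    with only one node no replicas are planned.
    """
    node_count = len(primaries_per_node)
    plan = []
    for i, primaries in enumerate(primaries_per_node):
        if node_count > 1:
            replicas = primaries_per_node[(i - 1) % node_count]
        else:
            replicas = 0
        plan.append({"node": i + 1, "primary": primaries, "replica": replicas})
    return plan


plan = plan_shards([2, 3, 1])
total_primaries = sum(entry["primary"] for entry in plan)
total_replicas = sum(entry["replica"] for entry in plan)
```

In this layout every primary shard has exactly one replica on a different node, so the total number of replicas equals the total number of primaries and no more copies are kept than the cluster size can use.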
  • The server does not need to update the target data immediately, which would consume a large amount of hardware performance; this helps both to protect hardware performance and to keep user queries of the target data timely.
  • The embodiments of the present application also provide a data processing device. The device includes modules for executing the method described in FIG. 1 or FIG. 2 and is configured on a server.
  • FIG. 3 is a schematic block diagram of a data processing device provided by an embodiment of the present application. The data processing device of this embodiment includes:
  • a communication module 30, configured to receive a storage request for storing target data from a client, where the storage request includes the target data;
  • a processing module 31, configured to perform field analysis on the target data to obtain a field analysis result, where the field analysis result includes the field corresponding to the target data and the semantic information of the field;
  • the processing module 31 is further configured to detect whether a target index corresponding to the field exists in the preset index storage area and, if the target index does not exist in the preset index storage area, determine the storage structure type to which the target data belongs based on the semantic information;
  • the communication module 30 is further configured to send an index request for creating the target index to the pre-connected search server and to receive the target index returned by the search server, where the index request carries the storage structure type and the target data, so that the search server can create a target index matching the storage structure type for the target data according to the index request;
  • the processing module 31 is further configured to store the target data according to the target index.
  • The processing module 31 is specifically configured to determine that the storage structure type to which the target data belongs is the keyword type if it is detected, based on the semantic information, that the field corresponding to the target data is used for complete matching search.
  • The processing module 31 is further specifically configured to determine that the storage structure type to which the target data belongs is the word segmentation type if it is detected, based on the semantic information, that the field corresponding to the target data is used for fuzzy matching search.
  • The communication module 30 is further configured to send update instruction information for the target data to the client after the target data has been stored in the target index; the update instruction information instructs the client to update the target data according to a preset update strategy.
  • The preset update strategy includes a delayed update strategy or a time update strategy. The delayed update strategy instructs the client to update the target data when a trigger operation for the target data is detected; the time update strategy instructs the client to update the target data after a preset time.
  • The processing module 31 is further configured to, after the index request for creating the target index has been sent to the pre-connected search server, generate a preset index and store the target data according to the preset index if the target index returned by the search server is not received within a preset time.
  • The processing module 31 is further configured to create an asynchronous thread for receiving the target index returned by the search server after generating the preset index and, if the target index returned by the search server is received through the asynchronous thread, to update the preset index with the target index and store the target data according to the target index.
  • When the communication module 30 receives a storage request for storing target data from the client, the processing module 31 performs field analysis on the target data to obtain the field corresponding to the target data and the semantic information of the field, and detects whether a target index corresponding to the field exists in the preset index storage area. If the target index does not exist in the preset index storage area, the processing module 31 determines the storage structure type to which the target data belongs based on the semantic information; the communication module 30 sends an index request for creating the target index to the pre-connected search server and then receives the target index returned by the search server; and the processing module 31 stores the target data according to the target index.
  • The target index can thus be created according to the storage structure type to which the target data belongs, which prevents unnecessary fields from being split and helps improve the efficiency of index creation.
  • FIG. 4 is a schematic block diagram of a server according to an embodiment of the present application. The server includes a processor 401, a memory 402, and a network interface 403, which may be connected by a bus or in other ways. Connection by a bus is taken as an example.
  • The network interface 403 is controlled by the processor 401 to send and receive messages; the memory 402 stores a computer program that includes program instructions; and the processor 401 executes the program instructions stored in the memory 402.
  • The processor 401 is configured to call the program instructions to execute: when a storage request for storing target data is received from a client through the network interface 403, performing field analysis on the target data to obtain a field analysis result, where the field analysis result includes the field corresponding to the target data and the semantic information of the field; detecting whether a target index corresponding to the field exists in the preset index storage area; if the target index does not exist in the preset index storage area, determining the storage structure type to which the target data belongs based on the semantic information; sending an index request for creating the target index to the pre-connected search server through the network interface 403, where the index request carries the storage structure type and the target data, so that the search server can create a target index matching the storage structure type for the target data according to the index request; and receiving, through the network interface 403, the target index returned by the search server and storing the target data according to the target index.
  • The processor 401 is specifically configured to determine that the storage structure type to which the target data belongs is the keyword type if it is detected, based on the semantic information, that the field corresponding to the target data is used for complete matching search.
  • The processor 401 is further specifically configured to determine that the storage structure type to which the target data belongs is the word segmentation type if it is detected, based on the semantic information, that the field corresponding to the target data is used for fuzzy matching search.
  • The network interface 403 is further configured to send update instruction information for the target data to the client after the target data has been stored in the target index; the update instruction information instructs the client to update the target data according to a preset update strategy.
  • The preset update strategy includes a delayed update strategy or a time update strategy. The delayed update strategy instructs the client to update the target data when a trigger operation for the target data is detected; the time update strategy instructs the client to update the target data after a preset time.
  • The processor 401 is further configured to, after the index request for creating the target index has been sent to the pre-connected search server, generate a preset index and store the target data according to the preset index if the target index returned by the search server is not received within a preset time.
  • The processor 401 is further configured to create an asynchronous thread for receiving the target index returned by the search server after generating the preset index and, if the target index returned by the search server is received through the asynchronous thread, to update the preset index with the target index and store the target data according to the target index.
  • The processor 401 may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor or any conventional processor.
  • The memory 402 may include a read-only memory and a random access memory, and provides instructions and data to the processor 401. A part of the memory 402 may also include a non-volatile random access memory. For example, the memory 402 may also store device type information.
  • The processor 401, memory 402, and network interface 403 described in the embodiments of the present application can execute the implementations described in the method embodiments of FIG. 1 or FIG. 2, as well as the implementation of the data processing device described in the embodiments of the present application, which will not be repeated here.
  • A computer non-volatile readable storage medium stores a computer program that includes program instructions. When the program instructions are executed by a processor, the following is realized: when a storage request for storing target data is received from a client, field analysis is performed on the target data to obtain a field analysis result, where the field analysis result includes the field corresponding to the target data and the semantic information of the field; whether a target index corresponding to the field exists in the preset index storage area is detected; if the target index does not exist in the preset index storage area, the storage structure type to which the target data belongs is determined based on the semantic information; an index request for creating the target index is sent to a pre-connected search server, where the index request carries the storage structure type and the target data, so that the search server can create a target index matching the storage structure type for the target data according to the index request; and the target index returned by the search server is received, and the target data is stored according to the target index.
  • The computer non-volatile readable storage medium may be an internal storage unit of the server described in any of the foregoing embodiments, such as the hard disk or memory of the server.
  • The computer non-volatile readable storage medium may also be an external storage device of the server, such as a plug-in hard disk equipped on the server, a smart media card (SMC), a secure digital (SD) card, or a flash card.
  • The computer non-volatile readable storage medium may also include both an internal storage unit of the server and an external storage device.
  • The computer non-volatile readable storage medium is used to store the computer program and other programs and data required by the server.
  • The computer non-volatile readable storage medium can also be used to temporarily store data that has been output or will be output.
  • The program can be stored in a computer-readable storage medium and, when executed, may include the procedures of the above-mentioned method embodiments.
  • The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Fuzzy Systems (AREA)
  • Computational Linguistics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Automation & Control Theory (AREA)
  • Artificial Intelligence (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A data processing method and a related device. The method comprises: when receiving a storage request for storing target data from a client, performing field analysis on the target data; if, on the basis of a field analysis result, it is detected that a target index does not exist in a preset index storage area, determining, on the basis of semantic information, the storage structure type to which the target data belongs, and sending an index request for creating the target index to a pre-connected search server; and when the target index returned by the search server is received, storing the target data according to the target index. The method can create the target index according to the storage structure type to which the target data belongs, and can improve index creation efficiency.

Description

A data processing method and related device
This application claims priority to Chinese patent application No. 201910679327.9, filed with the Chinese Patent Office on July 25, 2019 and entitled "A data processing method and related device", the entire content of which is incorporated herein by reference.
Technical field
This application relates to the field of data processing technology, and in particular to a data processing method and related device.
Background
At present, many data management platforms provide a function for querying target data by keyword. To use this function, an index corresponding to the target data must first be created in the storage area of the server; once the index has been created, the target data can be queried by keyword. The keyword is obtained by performing word segmentation on the target data.
Traditional services generally store target data in relational databases such as MySQL and Oracle. If the server creates the index corresponding to the target data through a relational database in order to implement the query function, an additional set of database services must be maintained, and unstructured target data cannot be segmented well enough to create the index. How to create indexes and store target data more efficiently, so that the target data can be queried, has therefore become an urgent problem.
Summary of the invention
The embodiments of this application provide a data processing method and related device that help improve the efficiency of index creation.
In a first aspect, an embodiment of this application provides a data processing method applied to a server, the method comprising:
receiving, from a client, a storage request for storing target data, the storage request including the target data;
performing field analysis on the target data to obtain a field analysis result, the field analysis result including the field corresponding to the target data and the semantic information of the field;
detecting whether a target index corresponding to the field exists in a preset index storage area;
if it is detected that the target index does not exist in the preset index storage area, determining, based on the semantic information, the storage structure type to which the target data belongs;
sending an index request for creating the target index to a pre-connected search server, the index request carrying the storage structure type and the target data, so that the search server creates, according to the index request, a target index for the target data that matches the storage structure type;
receiving the target index returned by the search server, and storing the target data according to the target index.
In a second aspect, an embodiment of this application provides a data processing device that includes modules for executing the method of the first aspect.
In a third aspect, an embodiment of this application provides a server that includes a processor, a network interface, and a memory connected to one another. The network interface is controlled by the processor to send and receive messages, and the memory stores a computer program that supports the server in executing the above method. The computer program includes program instructions, and the processor is configured to call the program instructions to execute the method of the first aspect.
In a fourth aspect, an embodiment of this application provides a computer non-volatile readable storage medium that stores a computer program. The computer program includes program instructions that, when executed by a processor, cause the processor to execute the method of the first aspect.
In the embodiments of this application, the server can create the target index according to the storage structure type to which the target data belongs, which prevents unnecessary fields from being split and helps improve the efficiency of index creation.
Description of the drawings
FIG. 1 is a schematic flowchart of a data processing method provided by an embodiment of this application;
FIG. 2 is a schematic flowchart of another data processing method provided by an embodiment of this application;
FIG. 3 is a schematic block diagram of a data processing device provided by an embodiment of this application;
FIG. 4 is a schematic block diagram of a server provided by an embodiment of this application.
Detailed description
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application without creative effort fall within the protection scope of this application.
Referring to FIG. 1, FIG. 1 is a schematic flowchart of a data processing method provided by an embodiment of this application. The method is applied to a server and may be executed by the server. As shown in the figure, the data processing method may include:
101: Receive, from a client, a storage request for storing target data, the storage request including the target data.
102: Perform field analysis on the target data to obtain a field analysis result, the field analysis result including the field corresponding to the target data and the semantic information of the field.
The server may be the server of a data management platform. It may be a single server or a server cluster composed of multiple servers, and it provides data management services. For example, the data management platform may be a log platform that provides a function for querying logs by keyword. The client may be an application or website corresponding to the log platform, or a terminal device on which the log platform application is installed or the log platform website is open. In one embodiment, the target data may be unstructured data.
103: Detect whether a target index corresponding to the field exists in a preset index storage area; if it is detected that the target index does not exist in the preset index storage area, determine, based on the semantic information, the storage structure type to which the target data belongs.
To implement the query function for the target data, the server needs to create the index corresponding to the target data in the index storage area in advance. Once the index has been created, the target data can be queried by keyword, where the index and the keyword correspond to each other. In one embodiment, the preset index storage area includes at least one index, and each index corresponds to a keyword. In this case, when the server receives a storage request for storing target data from the client, it may perform field analysis on the target data to obtain at least one field corresponding to the target data. The at least one field may then be compared with the pre-stored keyword of each index; if any of the fields matches the keyword of any index, it is determined that the target index exists in the preset index storage area.
Conversely, if the server detects that none of the at least one field matches the keyword of any index, it determines that the target index does not exist in the preset index storage area.
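The existence check described above can be sketched as a simple comparison between the parsed fields and the keyword of each stored index. This is only an illustration: the function name `find_target_index` and the dictionary layout (index name mapped to its keyword) are assumptions, not details given in the application.

```python
def find_target_index(fields, index_keywords):
    """Return the name of the first index whose keyword matches a parsed field.

    fields: field names obtained from field analysis of the target data.
    index_keywords: hypothetical layout of the preset index storage area,
        mapping index name -> keyword of that index.
    Returns None when no field matches, i.e. the target index does not exist.
    """
    field_set = set(fields)
    for index_name, keyword in index_keywords.items():
        if keyword in field_set:
            return index_name
    return None


existing = {"users-index": "user_name", "orders-index": "order_id"}
print(find_target_index(["user_name", "company_name"], existing))
print(find_target_index(["company_name"], existing))
```

A return value of `None` corresponds to the branch in which the server goes on to determine the storage structure type and request a new index.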
The storage structure types may include a keyword type and a word segmentation type. In one embodiment, when the server detects that the target index does not exist in the preset index storage area, it may examine the semantic information of each field corresponding to the target data. If, based on the semantic information, it detects that any field corresponding to the target data is used for exact-match search, the storage structure type to which the target data belongs is determined to be the keyword type.
If, based on the semantic information, it detects that any field corresponding to the target data is used for fuzzy-match search, the storage structure type to which the target data belongs is determined to be the word segmentation type. A field used for exact-match search may be called a first field, and the first field is unique: for example, its semantic information may represent a user name or an ID number, and each user corresponds to only one user name and one ID number. A field used for fuzzy-match search may be called a second field, and its semantic information is not unique: for example, it may represent a company name, and one company name may correspond to multiple users. In this way, the server can create different storage structures for different target data, which prevents the search server from splitting characters unnecessarily when creating the target index and effectively improves data processing efficiency.
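As a rough illustration of this rule, the sketch below classifies fields by their semantic label. The label sets and the function name are hypothetical, and the sketch classifies field by field, whereas the application speaks of the type to which the target data belongs; that per-field granularity is an assumption made for clarity.

```python
# Hypothetical semantic labels; the application only gives user name and
# ID number (unique) and company name (not unique) as examples.
EXACT_MATCH_SEMANTICS = {"user_name", "id_number"}
FUZZY_MATCH_SEMANTICS = {"company_name"}


def storage_structure_type(semantic_label):
    """Map a field's semantic label to a storage structure type."""
    if semantic_label in EXACT_MATCH_SEMANTICS:
        return "keyword"            # stored whole, used for exact-match search
    if semantic_label in FUZZY_MATCH_SEMANTICS:
        return "word_segmentation"  # split into terms, used for fuzzy-match search
    raise ValueError(f"no rule for semantic label: {semantic_label}")


field_semantics = {"name": "user_name", "employer": "company_name"}
types = {field: storage_structure_type(label) for field, label in field_semantics.items()}
print(types)
```

Raising an error for an unknown label, rather than picking a default, keeps the sketch from guessing at behavior the application does not describe.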
104: Send an index request for creating the target index to the pre-connected search server, the index request carrying the storage structure type and the target data, so that the search server creates, according to the index request, a target index for the target data that matches the storage structure type.
105: Receive the target index returned by the search server, and store the target data according to the target index.
In one embodiment, a configuration file for the search server may be configured in advance. The configuration file contains the settings for connecting to the search server, such as the address, port, protocol, connection timeout, number of routes of the protocol, and maximum number of connections. When the server detects that the search server has started, it may register with the search server based on the configuration file, thereby enabling subsequent data exchange with the search server, that is, establishing a connection with the search server.
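The connection settings listed above might be held in a small configuration object such as the following. The field names and default values are illustrative assumptions; in particular, "number of routes of the protocol" is interpreted here as a per-route connection limit, which the application does not confirm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchServerConfig:
    """Hypothetical container for the search server connection settings."""
    address: str
    port: int
    protocol: str = "http"
    connect_timeout_s: float = 5.0
    max_per_route: int = 10       # assumed meaning of "number of routes"
    max_connections: int = 50

    def base_url(self):
        """Build the URL that the server would use to reach the search server."""
        return f"{self.protocol}://{self.address}:{self.port}"


config = SearchServerConfig(address="localhost", port=9200)
print(config.base_url())
```

Making the object immutable (`frozen=True`) reflects that the settings are configured in advance and only read when the connection is established.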
The search server may be Elasticsearch, a search server based on Lucene. It provides a distributed, multi-user full-text search engine with a RESTful web interface. Elasticsearch is developed in Java and released as open source under the terms of the Apache license, and is a popular enterprise search engine. It is designed for use in cloud computing and offers real-time search, stability, reliability, speed, and easy installation and use.
In one embodiment, after the server determines the storage structure type to which the target data belongs, it may send an index request for creating the target index to the search server with which a connection has been established. The index request includes the storage structure type of the target data and the target data itself. Based on the target data and the storage structure type, the search server may automatically create a target index for the target data that matches the storage structure type and return the target index to the server. After receiving the target index returned by the search server, the server may store the target data in the storage area corresponding to the target index and assign a keyword to the target data, so that the target data can later be queried with that keyword. The storage area corresponding to the target index may be a disk or a folder.
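A possible shape for such an index request is sketched below. It assumes that the keyword type and the word segmentation type correspond to Elasticsearch's `keyword` and `text` field types, and that the request body follows the `mappings`/`properties` layout used by Elasticsearch 7 and later; neither assumption is stated in the application. The wrapper keys `index` and `document` and the function name are hypothetical, and no network call is made.

```python
# Assumed correspondence between the application's storage structure types
# and Elasticsearch field types.
ES_FIELD_TYPE = {"keyword": "keyword", "word_segmentation": "text"}


def build_index_request(index_name, structure_types, target_data):
    """Assemble an index request carrying the structure types and the data.

    structure_types: field name -> "keyword" or "word_segmentation".
    target_data: the document to be stored once the index exists.
    """
    properties = {
        field: {"type": ES_FIELD_TYPE[kind]}
        for field, kind in structure_types.items()
    }
    return {
        "index": index_name,
        "body": {"mappings": {"properties": properties}},
        "document": target_data,
    }


request = build_index_request(
    "users-index",
    {"name": "keyword", "employer": "word_segmentation"},
    {"name": "Zhang San", "employer": "Example Technology Co."},
)
print(request["body"])
```

Declaring exact-match fields as `keyword` is what keeps the search server from analyzing them into terms, which is the saving the embodiment attributes to choosing the storage structure type up front.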
In one embodiment, the data processing method may be applied to a plug-in corresponding to a data management platform, and the plug-in is installed into the search server. For example, where the data management platform is a log platform, the plug-in may create a folder for its own service under the service folder of the search server. The folder contains the log cloud plug-in in the form of a jar package, the running information of the current plug-in, and the configuration files required by the log platform. The running information includes: the description of the plug-in, which describes what the plug-in does; the version of the plug-in; the name under which the plug-in is displayed in the search server; the entry point of the plug-in; the Java version used by the plug-in; and the specific version of the search server for which the plug-in is released.
In one embodiment, if it is detected that the target index exists in the preset index storage area, the target data may be stored directly in the target index.
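Steps 101 to 105, together with the direct-store branch just described, can be tied together in one short sketch. Everything here is a simplification under stated assumptions: field analysis is reduced to reading dictionary keys, the preset index storage area is a dictionary keyed by index keyword, and `classify` and `create_index` are hypothetical callables standing in for the semantic analysis and the search server.

```python
def handle_storage_request(target_data, indexes, classify, create_index):
    """Store target_data, creating an index through create_index if needed.

    indexes: hypothetical index storage area, keyword -> list of stored records.
    classify: callable(fields) -> storage structure type.
    create_index: callable(structure_type, target_data) -> keyword of new index.
    Returns the keyword of the index in which the data was stored.
    """
    fields = list(target_data)                 # step 102, simplified
    for field in fields:                       # step 103: look for an existing index
        if field in indexes:
            indexes[field].append(target_data) # index exists: store directly
            return field
    structure_type = classify(fields)          # step 103: determine the type
    keyword = create_index(structure_type, target_data)  # steps 104 and 105
    indexes.setdefault(keyword, []).append(target_data)
    return keyword


indexes = {"order_id": []}
created = []

def fake_create_index(structure_type, data):
    created.append(structure_type)
    return "user_name"

first = handle_storage_request({"user_name": "Zhang San"}, indexes,
                               lambda fields: "keyword", fake_create_index)
second = handle_storage_request({"user_name": "Li Si"}, indexes,
                                lambda fields: "keyword", fake_create_index)
print(first, second, created)
```

In the usage above, the second request finds the index created by the first one, so the stand-in search server is called only once.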
In the embodiments of this application, the server can create the target index according to the storage structure type to which the target data belongs, which prevents unnecessary fields from being split and helps improve the efficiency of index creation.
Referring to FIG. 2, FIG. 2 is a schematic flowchart of another data processing method provided by an embodiment of this application. The method may be executed by a server. As shown in the figure, the data processing method may include:
201: When a storage request for storing target data is received from a client, perform field analysis on the target data to obtain a field analysis result, the field analysis result including the field corresponding to the target data and the semantic information of the field.
202: Detect whether a target index corresponding to the field exists in a preset index storage area; if it is detected that the target index does not exist in the preset index storage area, determine, based on the semantic information, the storage structure type to which the target data belongs.
203: Send an index request for creating the target index to the pre-connected search server, the index request carrying the storage structure type and the target data, so that the search server creates, according to the index request, a target index for the target data that matches the storage structure type.
204: Receive the target index returned by the search server, and store the target data according to the target index.
For the specific implementation of steps 201 to 204, refer to the description of steps 101 to 105 in the foregoing embodiment; details are not repeated here.
205: Send update instruction information for the target data to the client, the update instruction information instructing the client to update the target data according to a preset update strategy.
In one embodiment, the preset update strategy may include a delayed update strategy or a time update strategy. The delayed update strategy instructs the client to update the target data when it detects a trigger operation on the target data; the time update strategy instructs the client to update the target data after a preset time. The trigger operation may be a search operation for the target data, a viewing operation on the target data, or any other operation on the target data, which is not specifically limited in this application.
In one embodiment, the preset time may be 0s, 1s, and so on. It may be set by default in advance by the developer or selected by the user as needed. Alternatively, once the preset time has been determined, the user may adjust it as needed. A preset time of 0s can be understood as an immediate refresh.
For example, where the preset update strategy is a time update strategy that instructs the client to update the target data after 0s, the client may update the target data immediately after receiving the update instruction information for the target data.
In one embodiment, when the preset update strategy is the delayed update strategy, the delayed update strategy instructs the client to update the target data when it detects a viewing operation on the target data. The viewing operation may be, for example, a touch operation on a view button or a voice signal for viewing the target data. In this way, the target data does not have to be updated immediately at a heavy cost in hardware performance; the delayed refresh is performed the next time a trigger operation occurs on the target data, which both protects hardware performance and ensures that users can query the saved target data in time.
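The two update strategies can be sketched from the client's point of view as follows. The class and method names are hypothetical, and time is passed in explicitly (`now`) rather than read from a clock so that the behavior is easy to follow; the application does not prescribe any of these details.

```python
class ClientView:
    """Client-side copy of the target data, refreshed per the preset strategy."""

    def __init__(self, fetch, strategy, preset_time_s=0.0):
        if strategy not in ("delayed", "timed"):
            raise ValueError(f"unknown strategy: {strategy}")
        self._fetch = fetch                  # callable returning the latest data
        self._strategy = strategy
        self._preset_time_s = preset_time_s
        self._refresh_at = None              # due time under the time strategy
        self._stale = False                  # pending refresh under the delayed strategy
        self.data = None

    def on_update_instruction(self, now):
        """Handle the update instruction information sent by the server."""
        if self._strategy == "timed":
            self._refresh_at = now + self._preset_time_s
            self.tick(now)                   # a preset time of 0s refreshes at once
        else:
            self._stale = True               # wait for the next trigger operation

    def tick(self, now):
        """Refresh under the time strategy once the preset time has elapsed."""
        if self._refresh_at is not None and now >= self._refresh_at:
            self.data = self._fetch()
            self._refresh_at = None

    def on_trigger(self):
        """Handle a search or viewing operation on the target data."""
        if self._stale:
            self.data = self._fetch()
            self._stale = False
        return self.data


lazy = ClientView(lambda: "latest record", "delayed")
lazy.on_update_instruction(now=0.0)
print(lazy.data)          # still not refreshed
print(lazy.on_trigger())  # refreshed on the first trigger operation
```

Under the delayed strategy the fetch cost is paid only when someone actually looks at the data, which is the hardware saving the embodiment describes.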
In one embodiment, after the index request for creating the target index is sent to the pre-connected search server, if the target index returned by the search server is not received within a preset time, a preset index may be generated and the target data stored according to the preset index.
In one embodiment, the waiting duration for the search server to return the target index (that is, the preset time) may be set in advance. When the index request for creating the target index is sent to the pre-connected search server, a timer is started. If the time elapsed on the timer is greater than or equal to the waiting duration and the target index has still not been received, a preset index may be obtained and the target data stored in the storage area corresponding to the preset index.
In one embodiment, after generating the preset index, the server may also create an asynchronous thread for receiving the target index returned by the search server; that is, it starts a new thread that continues to wait for the search server to return the target index. If the server receives the target index returned by the search server through the asynchronous thread, it updates the preset index with the target index and stores the target data according to the target index.
In one embodiment, a callback listener may be registered in advance for the event of receiving the target index returned by the search server. When the server does not receive the target index returned by the search server within the preset time, it may enable the callback listener (that is, create the asynchronous thread described above) to receive the target index returned by the search server.
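The timeout, the preset index, and the callback listener can be sketched with Python's `concurrent.futures`. This is an analogy rather than the application's implementation: the request runs on a worker thread from the start, and the done-callback plays the role of the callback listener. `IndexStore`, `store_with_fallback`, and the default name `preset-index` are hypothetical.

```python
import concurrent.futures
import threading


class IndexStore:
    """Hypothetical record of which index each piece of target data is stored under."""

    def __init__(self):
        self.index_for = {}
        self._lock = threading.Lock()

    def store(self, data_id, index_name):
        with self._lock:
            self.index_for[data_id] = index_name


def store_with_fallback(executor, create_index, data_id, store, wait_s,
                        preset_index="preset-index"):
    """Wait up to wait_s for the target index; otherwise use a preset index.

    create_index stands in for the round trip to the search server.
    """
    future = executor.submit(create_index, data_id)
    try:
        store.store(data_id, future.result(timeout=wait_s))
    except concurrent.futures.TimeoutError:
        store.store(data_id, preset_index)   # store under the preset index for now

        def on_target_index(done):
            # Callback listener: replace the preset index once the target index arrives.
            if done.exception() is None:
                store.store(data_id, done.result())

        future.add_done_callback(on_target_index)
    return future


release = threading.Event()

def slow_create_index(data_id):
    release.wait(5)                          # simulates a slow search server
    return "target-index"

store = IndexStore()
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as executor:
    store_with_fallback(executor, slow_create_index, "record-1", store, wait_s=0.05)
    print(store.index_for["record-1"])       # stored under the preset index
    release.set()
print(store.index_for["record-1"])           # replaced after the target index arrives
```

If the future completes between the timeout and the registration of the callback, `add_done_callback` runs the callback immediately, so the target index is not lost in that window.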
In one embodiment, the data processing method in the embodiments of this application is applied to a server cluster in which N nodes (for example, N servers) are deployed. In this case, before the index request for creating the target index is sent to the pre-connected search server, the number of primary shards and replica shards corresponding to each node when the target data is sharded may be determined based on the number of nodes in the server cluster, and these numbers are added to the index request and sent to the search server. After receiving the index request, the search server may shard the target data according to the number of primary and replica shards corresponding to each node and create the target index corresponding to each node; once creation is complete, it returns the target indexes to their respective nodes. In this way, the number of primary and replica shards can be set according to the number of nodes deployed in the cluster, which effectively reduces the hardware resources wasted by an excessive number of replica shards.
For example, when there are N nodes and the number of primary and replica shards for each node is being determined based on the number of nodes in the server cluster, the numbers may be determined according to the principle that the number of replica shards of the second node equals the number of primary shards of the first node, the number of replica shards of the third node equals the number of primary shards of the second node, and so on. A primary shard and its replica shard store the same data, which prevents data loss caused by hardware problems.
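The allocation principle in this example can be sketched as below. The application does not say how many replica shards the first node receives; the sketch assumes the pattern wraps around, so that the first node's replica count equals the last node's primary count. That wraparound, the function name, and the output layout are all assumptions.

```python
def plan_shards(primaries_per_node):
    """Derive per-node replica counts from per-node primary counts.

    primaries_per_node: primary shard count for node 1, node 2, ..., node N.
    Node k's replica count equals node k-1's primary count; node 1 takes
    its count from node N (assumed wraparound).
    """
    plan = []
    for i, primaries in enumerate(primaries_per_node):
        plan.append({
            "node": i + 1,
            "primaries": primaries,
            "replicas": primaries_per_node[i - 1],  # i - 1 == -1 wraps to the last node
        })
    return plan


for entry in plan_shards([2, 3, 1]):
    print(entry)
```

With this rule the total number of replica shards equals the total number of primary shards, so every primary has exactly one copy on a different node and no node carries more replicas than the cluster size justifies.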
In the embodiments of this application, the server does not need to update the target data immediately at a heavy cost in hardware performance, which helps balance the protection of hardware performance against the timeliness with which users can query the target data.
An embodiment of this application further provides a data processing device. The device includes modules for executing the method described in FIG. 1 or FIG. 2 and is configured in a server. Specifically, FIG. 3 is a schematic block diagram of the data processing device provided by an embodiment of this application. The data processing device of this embodiment includes:
a communication module 30, configured to receive, from a client, a storage request for storing target data, the storage request including the target data;
a processing module 31, configured to perform field analysis on the target data to obtain a field analysis result, the field analysis result including the field corresponding to the target data and the semantic information of the field;
the processing module 31 being further configured to detect whether a target index corresponding to the field exists in a preset index storage area and, if it is detected that the target index does not exist in the preset index storage area, to determine, based on the semantic information, the storage structure type to which the target data belongs;
the communication module 30 being further configured to send an index request for creating the target index to a pre-connected search server and to receive the target index returned by the search server, the index request carrying the storage structure type and the target data, so that the search server creates, according to the index request, a target index for the target data that matches the storage structure type;
the processing module 31 being further configured to store the target data according to the target index.
In one embodiment, the processing module 31 is specifically configured to determine the storage structure type to which the target data belongs to be the keyword type if, based on the semantic information, it detects that the field corresponding to the target data is used for exact-match search.
In one embodiment, the processing module 31 is further specifically configured to determine the storage structure type to which the target data belongs to be the word segmentation type if, based on the semantic information, it detects that the field corresponding to the target data is used for fuzzy-match search.
In one embodiment, the communication module 30 is further configured to send, after the target data is stored in the target index, update instruction information for the target data to the client, the update instruction information instructing the client to update the target data according to a preset update strategy.
In one embodiment, the preset update strategy includes a delayed update strategy or a time update strategy. The delayed update strategy instructs the client to update the target data when it detects a trigger operation on the target data; the time update strategy instructs the client to update the target data after a preset time.
In one embodiment, the processing module 31 is further configured to generate a preset index and store the target data according to the preset index if, after the index request for creating the target index is sent to the pre-connected search server, the target index returned by the search server is not received within a preset time.
In one embodiment, the processing module 31 is further configured to create, after generating the preset index, an asynchronous thread for receiving the target index returned by the search server and, if the target index returned by the search server is received through the asynchronous thread, to update the preset index with the target index and store the target data according to the target index.
It should be noted that the functions of the functional modules of the data processing device described in the embodiments of this application may be implemented according to the methods in the method embodiments described in FIG. 1 or FIG. 2. For the specific implementation process, refer to the description of the method embodiments of FIG. 1 or FIG. 2; details are not repeated here.
In the embodiments of this application, when the communication module 30 receives a storage request for storing target data from the client, the processing module 31 performs field analysis on the target data to obtain the field corresponding to the target data and the semantic information of the field, and detects whether a target index corresponding to the field exists in the preset index storage area. If it is detected that the target index does not exist in the preset index storage area, the processing module 31 determines, based on the semantic information, the storage structure type to which the target data belongs. An index request for creating the target index is then sent through the communication module 30 to the pre-connected search server, the target index returned by the search server is received, and the processing module 31 stores the target data according to the target index. With the embodiments of this application, the target index can be created according to the storage structure type to which the target data belongs, which prevents unnecessary fields from being split and helps improve the efficiency of index creation.
Referring to FIG. 4, FIG. 4 is a schematic block diagram of a server provided by an embodiment of the present application. As shown in FIG. 4, the server includes a processor 401, a memory 402, and a network interface 403. The processor 401, the memory 402, and the network interface 403 may be connected by a bus or in other ways; FIG. 4 takes connection by a bus as an example. The network interface 403 is controlled by the processor to send and receive messages, the memory 402 is used to store a computer program that includes program instructions, and the processor 401 is used to execute the program instructions stored in the memory 402. The processor 401 is configured to call the program instructions to execute: when a storage request for storing target data is received from a client through the network interface 403, performing field parsing on the target data to obtain a field parsing result, the field parsing result including the field corresponding to the target data and the semantic information of the field; detecting whether a target index corresponding to the field exists in a preset index storage area; if it is detected that the target index does not exist in the preset index storage area, determining, based on the semantic information, the storage structure type to which the target data belongs; sending, through the network interface 403, an index request for creating the target index to a pre-connected search server, the index request carrying the storage structure type and the target data, so that the search server creates, according to the index request, a target index for the target data that matches the storage structure type; and receiving, through the network interface 403, the target index returned by the search server, and storing the target data according to the target index.
In one embodiment, the processor 401 is specifically configured to determine the storage structure type to which the target data belongs as a keyword type if it is detected, based on the semantic information, that the field corresponding to the target data is used for exact-match lookup.
In one embodiment, the processor 401 is further specifically configured to determine the storage structure type to which the target data belongs as a word segmentation type if it is detected, based on the semantic information, that the field corresponding to the target data is used for fuzzy-match lookup.
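To illustrate why the two storage structure types differ, the following toy inverted index may be considered. It is a sketch only: splitting on whitespace stands in for a real word segmenter, and the function name `build_index` and the type labels `"keyword"` and `"word_segmentation"` are assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, List, Set


def build_index(values: List[str], storage_type: str) -> Dict[str, Set[int]]:
    """Build a toy inverted index mapping terms to positions of matching values.

    "keyword": the whole value is kept as a single term (exact-match lookup only).
    "word_segmentation": the value is lowercased and split on whitespace,
    which is a simplified stand-in for a real tokenizer.
    """
    index: Dict[str, Set[int]] = defaultdict(set)
    for pos, value in enumerate(values):
        terms = [value] if storage_type == "keyword" else value.lower().split()
        for term in terms:
            index[term].add(pos)
    return dict(index)
```

A keyword-type field produces one term per value, so no splitting work is done for it, whereas a word-segmentation-type field produces one term per token so that partial queries can match.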
In one embodiment, the network interface 403 is further configured to, after the target data is stored in the target index, send update instruction information for the target data to the client, where the update instruction information is used to instruct the client to update the target data according to a preset update strategy.
In one embodiment, the preset update strategy includes a delayed update strategy or a time update strategy. The delayed update strategy is used to instruct the client to update the target data when a trigger operation for the target data is detected; the time update strategy is used to instruct the client to update the target data after a preset time.
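The two update strategies may be sketched on the client side as below. This is a minimal illustration under assumptions: the `UpdateInstruction` structure, its field names, and the `should_update` function are hypothetical, and the application does not specify how the update instruction information is encoded.

```python
from dataclasses import dataclass


@dataclass
class UpdateInstruction:
    """Hypothetical form of the update instruction information sent by the server."""
    strategy: str                # "delayed" or "time"
    preset_seconds: float = 0.0  # only meaningful for the "time" strategy


def should_update(instr: UpdateInstruction, elapsed_seconds: float, triggered: bool) -> bool:
    """Decide whether the client should refresh the target data now."""
    if instr.strategy == "delayed":
        # Delayed update: refresh only when a trigger operation on the data is detected.
        return triggered
    if instr.strategy == "time":
        # Time update: refresh once the preset time has elapsed.
        return elapsed_seconds >= instr.preset_seconds
    raise ValueError(f"unknown update strategy: {instr.strategy!r}")
```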
In one embodiment, the processor 401 is further configured to, after sending the index request for creating the target index to the pre-connected search server, generate a preset index if the target index returned by the search server is not received within a preset time, and store the target data according to the preset index.
In one embodiment, the processor 401 is further configured to, after generating a preset index, create an asynchronous thread for receiving the target index returned by the search server; if the target index returned by the search server is received through the asynchronous thread, the preset index is updated with the target index, and the target data is stored according to the target index.
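The timeout fallback and the asynchronous receiving thread described in the two preceding embodiments may be sketched as follows. The sketch assumes the search server call can be represented by a local callable run on a thread pool; `IndexedStore`, `store_item`, and the name `PRESET_INDEX` are hypothetical and serve only as an illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from typing import Callable, Dict, List, Optional, Tuple

PRESET_INDEX = "preset_index"  # hypothetical name for the fallback index


class IndexedStore:
    """Thread-safe toy storage: index name -> stored items."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.data: Dict[str, List[str]] = {}

    def put(self, index: str, item: str) -> None:
        with self._lock:
            self.data.setdefault(index, []).append(item)

    def replace_index(self, old: str, new: str) -> None:
        """Move everything stored under the preset index to the target index."""
        with self._lock:
            self.data.setdefault(new, []).extend(self.data.pop(old, []))


def store_item(
    item: str,
    request_index: Callable[[], str],
    store: IndexedStore,
    timeout: float,
    pool: ThreadPoolExecutor,
) -> Tuple[str, Optional[threading.Thread]]:
    """Store item under the target index, or under a preset index on timeout."""
    future = pool.submit(request_index)
    try:
        target = future.result(timeout=timeout)
    except FutureTimeout:
        # Target index not received within the preset time: fall back.
        store.put(PRESET_INDEX, item)

        def receive_late_index() -> None:
            # Asynchronous thread: wait for the target index, then update the preset index.
            store.replace_index(PRESET_INDEX, future.result())

        receiver = threading.Thread(target=receive_late_index, daemon=True)
        receiver.start()
        return PRESET_INDEX, receiver
    store.put(target, item)
    return target, None
```

Returning the receiver thread lets a caller wait for the late index if it needs to; a production design would also have to handle the case where the search server never answers, which this sketch leaves out.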
It should be understood that, in the embodiments of the present application, the processor 401 may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor.
The memory 402 may include a read-only memory and a random access memory, and provides instructions and data to the processor 401. A part of the memory 402 may also include a non-volatile random access memory. For example, the memory 402 may also store information about the device type.
In a specific implementation, the processor 401, the memory 402, and the network interface 403 described in the embodiments of the present application may execute the implementations described in the method embodiment of FIG. 1 or FIG. 2, and may also execute the implementations of the data processing device described in the embodiments of the present application, which are not repeated here.
Another embodiment of the present application provides a computer non-volatile readable storage medium. The computer non-volatile readable storage medium stores a computer program that includes program instructions which, when executed by a processor, implement: when a storage request for storing target data is received from a client, performing field parsing on the target data to obtain a field parsing result, the field parsing result including the field corresponding to the target data and the semantic information of the field; detecting whether a target index corresponding to the field exists in a preset index storage area; if it is detected that the target index does not exist in the preset index storage area, determining, based on the semantic information, the storage structure type to which the target data belongs; sending an index request for creating the target index to a pre-connected search server, the index request carrying the storage structure type and the target data, so that the search server creates, according to the index request, a target index for the target data that matches the storage structure type; and receiving the target index returned by the search server, and storing the target data according to the target index.
The computer non-volatile readable storage medium may be an internal storage unit of the server described in any of the foregoing embodiments, such as a hard disk or memory of the server. The computer non-volatile readable storage medium may also be an external storage device of the server, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card provided on the server. Further, the computer non-volatile readable storage medium may include both an internal storage unit of the server and an external storage device. The computer non-volatile readable storage medium is used to store the computer program and other programs and data required by the server, and may also be used to temporarily store data that has been output or is to be output.
A person of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments may be implemented by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
What is disclosed above is only some of the embodiments of the present application and certainly cannot be used to limit the scope of rights of the present application. A person of ordinary skill in the art can understand all or part of the processes for implementing the above embodiments, and equivalent changes made in accordance with the claims of the present application still fall within the scope covered by the invention.

Claims (20)

  1. A data processing method, characterized in that the method comprises:
    receiving a storage request from a client for storing target data, the storage request including the target data;
    performing field parsing on the target data to obtain a field parsing result, the field parsing result including the field corresponding to the target data and the semantic information of the field;
    detecting whether a target index corresponding to the field exists in a preset index storage area;
    if it is detected that the target index does not exist in the preset index storage area, determining, based on the semantic information, the storage structure type to which the target data belongs;
    sending an index request for creating the target index to a pre-connected search server, the index request carrying the storage structure type and the target data, so that the search server creates, according to the index request, a target index for the target data that matches the storage structure type;
    receiving the target index returned by the search server, and storing the target data according to the target index.
  2. The method according to claim 1, wherein the determining, based on the semantic information, the storage structure type to which the target data belongs comprises:
    if it is detected, based on the semantic information, that the field corresponding to the target data is used for exact-match lookup, determining the storage structure type to which the target data belongs as a keyword type.
  3. The method according to claim 1, wherein the determining, based on the semantic information, the storage structure type to which the target data belongs comprises:
    if it is detected, based on the semantic information, that the field corresponding to the target data is used for fuzzy-match lookup, determining the storage structure type to which the target data belongs as a word segmentation type.
  4. The method according to any one of claims 1-3, wherein after the target data is stored in the target index, the method further comprises:
    sending update instruction information for the target data to the client, the update instruction information being used to instruct the client to update the target data according to a preset update strategy.
  5. The method according to claim 4, wherein the preset update strategy comprises a delayed update strategy or a time update strategy, wherein the delayed update strategy is used to instruct the client to update the target data when a trigger operation for the target data is detected, and the time update strategy is used to instruct the client to update the target data after a preset time.
  6. The method according to any one of claims 1-5, wherein after the sending an index request for creating the target index to a pre-connected search server, the method further comprises:
    if the target index returned by the search server is not received within a preset time, generating a preset index, and storing the target data according to the preset index.
  7. The method according to claim 1, wherein the method is applied to a server cluster including at least one node, and the sending an index request for creating the target index to a pre-connected search server comprises:
    determining, based on the number of nodes in the server cluster, the number of primary shards and replica shards corresponding to each node in the at least one server when the target data is sharded;
    generating, according to the number of primary shards and replica shards corresponding to each node, an index request for creating the target index, and sending the index request to the pre-connected search server, so that the search server, after receiving the index request, shards the target data according to the number of primary shards and replica shards corresponding to each node and creates the target index corresponding to each node.
  8. A data processing device, characterized in that the device comprises:
    a communication module, configured to receive a storage request from a client for storing target data, the storage request including the target data;
    a processing module, configured to perform field parsing on the target data to obtain a field parsing result, the field parsing result including the field corresponding to the target data and the semantic information of the field;
    the processing module being further configured to detect whether a target index corresponding to the field exists in a preset index storage area, and, if it is detected that the target index does not exist in the preset index storage area, determine, based on the semantic information, the storage structure type to which the target data belongs;
    the communication module being further configured to send an index request for creating the target index to a pre-connected search server and receive the target index returned by the search server, the index request carrying the storage structure type and the target data, so that the search server creates, according to the index request, a target index for the target data that matches the storage structure type;
    the processing module being further configured to store the target data according to the target index.
  9. The device according to claim 8, wherein the processing module is specifically configured to determine the storage structure type to which the target data belongs as a keyword type if it is detected, based on the semantic information, that the field corresponding to the target data is used for exact-match lookup.
  10. The device according to claim 8, wherein the processing module is further specifically configured to determine the storage structure type to which the target data belongs as a word segmentation type if it is detected, based on the semantic information, that the field corresponding to the target data is used for fuzzy-match lookup.
  11. The device according to any one of claims 8-10, wherein the communication module is further configured to send update instruction information for the target data to the client, the update instruction information being used to instruct the client to update the target data according to a preset update strategy.
  12. The device according to claim 11, wherein the preset update strategy comprises a delayed update strategy or a time update strategy, wherein the delayed update strategy is used to instruct the client to update the target data when a trigger operation for the target data is detected, and the time update strategy is used to instruct the client to update the target data after a preset time.
  13. The device according to any one of claims 8-12, wherein the processing module is further configured to generate a preset index if the target index returned by the search server is not received through the communication module within a preset time, and store the target data according to the preset index.
  14. The device according to claim 8, wherein the device is applied to a server cluster including at least one node; the processing module is further specifically configured to determine, based on the number of nodes in the server cluster, the number of primary shards and replica shards corresponding to each node in the at least one server when the target data is sharded, and to generate, according to the number of primary shards and replica shards corresponding to each node, an index request for creating the target index; and the communication module is further configured to send the index request to the pre-connected search server, so that the search server, after receiving the index request, shards the target data according to the number of primary shards and replica shards corresponding to each node and creates the target index corresponding to each node.
  15. A server, characterized by comprising a processor, a memory, and a network interface that are connected to each other, wherein the memory is used to store a computer program, the computer program includes program instructions, and the processor is configured to call the program instructions to execute: when a storage request for storing target data is received from a client through the network interface, performing field parsing on the target data to obtain a field parsing result, the field parsing result including the field corresponding to the target data and the semantic information of the field; detecting whether a target index corresponding to the field exists in a preset index storage area; if it is detected that the target index does not exist in the preset index storage area, determining, based on the semantic information, the storage structure type to which the target data belongs; sending, through the network interface, an index request for creating the target index to a pre-connected search server, the index request carrying the storage structure type and the target data, so that the search server creates, according to the index request, a target index for the target data that matches the storage structure type; and receiving, through the network interface, the target index returned by the search server, and storing the target data according to the target index.
  16. The server according to claim 15, wherein the processor is specifically configured to determine the storage structure type to which the target data belongs as a keyword type if it is detected, based on the semantic information, that the field corresponding to the target data is used for exact-match lookup.
  17. The server according to claim 15, wherein the processor is further specifically configured to determine the storage structure type to which the target data belongs as a word segmentation type if it is detected, based on the semantic information, that the field corresponding to the target data is used for fuzzy-match lookup.
  18. The server according to any one of claims 15-17, wherein the processor is further configured to send, through the network interface, update instruction information for the target data to the client, the update instruction information being used to instruct the client to update the target data according to a preset update strategy.
  19. The server according to claim 18, wherein the preset update strategy comprises a delayed update strategy or a time update strategy, wherein the delayed update strategy is used to instruct the client to update the target data when a trigger operation for the target data is detected, and the time update strategy is used to instruct the client to update the target data after a preset time.
  20. A computer non-volatile readable storage medium, characterized in that the computer non-volatile readable storage medium stores a computer program, and the computer program is executed by a processor to implement the method according to any one of claims 1 to 7.
PCT/CN2019/120960 2019-07-25 2019-11-26 Data processing method and related device WO2021012553A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910679327.9 2019-07-25
CN201910679327.9A CN110489417B (en) 2019-07-25 2019-07-25 Data processing method and related equipment

Publications (1)

Publication Number Publication Date
WO2021012553A1 true WO2021012553A1 (en) 2021-01-28

Family

ID=68548292

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/120960 WO2021012553A1 (en) 2019-07-25 2019-11-26 Data processing method and related device

Country Status (2)

Country Link
CN (1) CN110489417B (en)
WO (1) WO2021012553A1 (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112100152A (en) * 2020-09-14 2020-12-18 广州华多网络科技有限公司 Service data processing method, system, server and readable storage medium
CN112948016A (en) * 2021-02-25 2021-06-11 京东数字科技控股股份有限公司 Configuration information generation method, device and equipment
CN113190623A (en) * 2021-05-14 2021-07-30 京东数科海益信息科技有限公司 Data processing method, device, server and storage medium
CN113392081A (en) * 2021-06-10 2021-09-14 北京猿力未来科技有限公司 Data processing system and method
CN116737428A (en) * 2023-08-14 2023-09-12 中科三清科技有限公司 Air quality mode operation stability checking method and device and electronic equipment
CN116842223A (en) * 2023-08-29 2023-10-03 天津鑫宝龙电梯集团有限公司 Working condition data management method, device, equipment and medium
WO2023185401A1 (en) * 2022-03-28 2023-10-05 华为技术有限公司 Data processing method, encoding and decoding accelerator and related device
CN116910260A (en) * 2023-09-13 2023-10-20 中国标准化研究院 Digital asset searching method based on big data
CN117076542A (en) * 2023-08-29 2023-11-17 中国中金财富证券有限公司 Data processing method and related device
CN117896440A (en) * 2024-03-15 2024-04-16 江西曼荼罗软件有限公司 Data caching acquisition method and system

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110489417B (en) * 2019-07-25 2023-03-28 深圳壹账通智能科技有限公司 Data processing method and related equipment
CN112988692B (en) * 2019-12-13 2024-05-07 阿里巴巴集团控股有限公司 Data processing method and device
CN111125176B (en) * 2019-12-20 2023-10-03 北京百度网讯科技有限公司 Service data searching method and device, electronic equipment and storage medium
CN111274350B (en) * 2020-02-03 2023-06-23 广州极尚网络技术有限公司 Data processing method, device, computer equipment and storage medium
CN111914126A (en) * 2020-07-22 2020-11-10 浙江乾冠信息安全研究院有限公司 Processing method, equipment and storage medium for indexed network security big data
CN111949479B (en) * 2020-07-31 2023-08-25 中国工商银行股份有限公司 Interactive system and index creation condition determining method and equipment
CN112100414B (en) * 2020-09-11 2024-02-23 深圳力维智联技术有限公司 Data processing method, device, system and computer readable storage medium
CN113760931B (en) * 2021-08-20 2023-12-29 济南浪潮数据技术有限公司 Resource information access method, device, equipment and medium
CN113626443B (en) * 2021-08-26 2024-03-15 企查查科技股份有限公司 Index data processing method, device, computer equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105988996A (en) * 2015-01-27 2016-10-05 腾讯科技(深圳)有限公司 Index file generation method and device
CN106326295A (en) * 2015-07-01 2017-01-11 中兴通讯股份有限公司 Method and device for storing semantic data
WO2018104274A1 (en) * 2016-12-08 2018-06-14 Bundesdruckerei Gmbh Database index comprising multiple fields
CN110019646A (en) * 2017-10-12 2019-07-16 北京京东尚科信息技术有限公司 A kind of method and apparatus for establishing index
CN110489417A (en) * 2019-07-25 2019-11-22 深圳壹账通智能科技有限公司 A kind of data processing method and relevant device

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10061807B2 (en) * 2012-05-18 2018-08-28 Splunk Inc. Collection query driven generation of inverted index for raw machine data
US20160154851A1 (en) * 2013-04-24 2016-06-02 Hitachi Ltd. Computing device, storage medium, and data search method
CN110019211A (en) * 2017-11-27 2019-07-16 北京京东尚科信息技术有限公司 The methods, devices and systems of association index
CN108874924B (en) * 2018-05-31 2022-11-04 康键信息技术(深圳)有限公司 Method and device for creating search service and computer-readable storage medium

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112100152A (en) * 2020-09-14 2020-12-18 广州华多网络科技有限公司 Service data processing method, system, server and readable storage medium
CN112948016A (en) * 2021-02-25 2021-06-11 京东数字科技控股股份有限公司 Configuration information generation method, device and equipment
CN113190623A (en) * 2021-05-14 2021-07-30 京东数科海益信息科技有限公司 Data processing method, device, server and storage medium
CN113190623B (en) * 2021-05-14 2024-05-17 京东科技信息技术有限公司 Data processing method, device, server and storage medium
CN113392081A (en) * 2021-06-10 2021-09-14 北京猿力未来科技有限公司 Data processing system and method
WO2023185401A1 (en) * 2022-03-28 2023-10-05 华为技术有限公司 Data processing method, encoding and decoding accelerator and related device
CN116737428A (en) * 2023-08-14 2023-09-12 中科三清科技有限公司 Air quality mode operation stability checking method and device and electronic equipment
CN116737428B (en) * 2023-08-14 2023-11-21 中科三清科技有限公司 Air quality mode operation stability checking method and device and electronic equipment
CN116842223B (en) * 2023-08-29 2023-11-10 天津鑫宝龙电梯集团有限公司 Working condition data management method, device, equipment and medium
CN117076542A (en) * 2023-08-29 2023-11-17 中国中金财富证券有限公司 Data processing method and related device
CN116842223A (en) * 2023-08-29 2023-10-03 天津鑫宝龙电梯集团有限公司 Working condition data management method, device, equipment and medium
CN117076542B (en) * 2023-08-29 2024-06-07 中国中金财富证券有限公司 Data processing method and related device
CN116910260B (en) * 2023-09-13 2023-11-17 中国标准化研究院 Digital asset searching method based on big data
CN116910260A (en) * 2023-09-13 2023-10-20 中国标准化研究院 Digital asset searching method based on big data
CN117896440A (en) * 2024-03-15 2024-04-16 江西曼荼罗软件有限公司 Data caching acquisition method and system
CN117896440B (en) * 2024-03-15 2024-05-24 江西曼荼罗软件有限公司 Data caching acquisition method and system

Also Published As

Publication number Publication date
CN110489417A (en) 2019-11-22
CN110489417B (en) 2023-03-28

Similar Documents

Publication Publication Date Title
WO2021012553A1 (en) Data processing method and related device
US10757106B2 (en) Resource access control method and device
WO2019192103A1 (en) Concurrent access control method and apparatus, terminal device, and medium
US11418525B2 (en) Data processing method, device and storage medium
WO2021012568A1 (en) Data processing method and related device
US9953639B2 (en) Voice recognition system and construction method thereof
WO2018040722A1 (en) Table data query method and device
US9838422B2 (en) Detecting denial-of-service attacks on graph databases
US10810056B2 (en) Adding descriptive metadata to application programming interfaces for consumption by an intelligent agent
US20110302277A1 (en) Methods and apparatus for web-based migration of data in a multi-tenant database system
US10235476B2 (en) Matching objects using match rules and lookup key
WO2023231341A1 (en) Method and apparatus for discovering data asset risk
US11477179B2 (en) Searching content associated with multiple applications
WO2017167208A1 (en) Method and apparatus for recognizing malicious website, and computer storage medium
WO2021022714A1 (en) Message processing method for cross-block chain node, device, apparatus and medium
WO2017121355A1 (en) Search processing method and device
WO2023165226A1 (en) Application resource backup method and apparatus, electronic device, and storage medium
EP3997589A1 (en) Delta graph traversing system
WO2015154416A1 (en) Internet access behaviour management method and device
WO2016169212A1 (en) File management method and device
US20230315741A1 (en) Federation of data during query time in computing systems
US10839028B2 (en) System for querying web pages using a real time entity authentication engine
US11496444B1 (en) Enforcing access control to resources of an indexing system using resource paths
US20230153457A1 (en) Privacy data management in distributed computing systems
US10423992B2 (en) Method, system, and medium for event based versioning and visibility for content releases

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 19938890; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 19938890; Country of ref document: EP; Kind code of ref document: A1)
32PN Ep: public notification in the EP bulletin as address of the addressee cannot be established (Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 09.08.2022))
122 Ep: PCT application non-entry in European phase (Ref document number: 19938890; Country of ref document: EP; Kind code of ref document: A1)