A Data Processing Method and Related Device
This application claims priority to Chinese patent application No. 201910679327.9, entitled "A Data Processing Method and Related Device", filed with the Chinese Patent Office on July 25, 2019, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of data processing technology, and in particular to a data processing method and related device.
Background
At present, many data management platforms provide keyword-based querying of target data. To use this function, an index corresponding to the target data must first be created in the storage area of the server; once the index has been created, the target data can be queried by keyword. The keyword is obtained by performing word segmentation on the target data.
Traditional services generally store target data in a relational database such as MySQL or Oracle. If the server creates the index for the target data through a relational database in order to support queries, a separate set of database services must be maintained, and unstructured target data cannot be segmented well enough to create an index. How to create indexes and store target data more efficiently, so that the target data can be queried, has therefore become an urgent problem.
Summary
The embodiments of this application provide a data processing method and related device that help improve the efficiency of index creation.
In a first aspect, an embodiment of this application provides a data processing method applied to a server, the method including:
receiving, from a client, a storage request for storing target data, the storage request including the target data;
performing field parsing on the target data to obtain a field parsing result, the field parsing result including the fields corresponding to the target data and the semantic information of the fields;
detecting whether a target index corresponding to the fields exists in a preset index storage area;
if it is detected that the target index does not exist in the preset index storage area, determining, based on the semantic information, the storage structure type to which the target data belongs;
sending, to a pre-connected search server, an index request for creating the target index, the index request carrying the storage structure type and the target data, so that the search server creates, according to the index request, a target index for the target data that matches the storage structure type;
receiving the target index returned by the search server, and storing the target data according to the target index.
In a second aspect, an embodiment of this application provides a data processing apparatus that includes modules for executing the method of the first aspect.
In a third aspect, an embodiment of this application provides a server that includes a processor, a network interface, and a memory connected to one another, where the network interface sends and receives messages under the control of the processor, the memory stores a computer program that supports the server in executing the above method, the computer program includes program instructions, and the processor is configured to call the program instructions to execute the method of the first aspect.
In a fourth aspect, an embodiment of this application provides a non-volatile computer-readable storage medium storing a computer program, the computer program including program instructions that, when executed by a processor, cause the processor to execute the method of the first aspect.
In the embodiments of this application, the server can create the target index according to the storage structure type to which the target data belongs, which avoids splitting fields that do not need to be split and helps improve the efficiency of index creation.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a data processing method provided by an embodiment of this application;
FIG. 2 is a schematic flowchart of another data processing method provided by an embodiment of this application;
FIG. 3 is a schematic block diagram of a data processing apparatus provided by an embodiment of this application;
FIG. 4 is a schematic block diagram of a server provided by an embodiment of this application.
Detailed Description
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application without creative effort fall within the scope of protection of this application.
Referring to FIG. 1, FIG. 1 is a schematic flowchart of a data processing method provided by an embodiment of this application. The method is applied to a server and can be executed by the server. As shown in the figure, the data processing method may include:
101: Receive, from a client, a storage request for storing target data, the storage request including the target data.
102: Perform field parsing on the target data to obtain a field parsing result, the field parsing result including the fields corresponding to the target data and the semantic information of the fields.
The server may be a server corresponding to a data management platform. It may be a single server or a server cluster composed of multiple servers, and it can provide services related to data management. For example, the data management platform may be a log platform that provides keyword-based log queries. The client may be an application or website corresponding to the log platform, or a terminal device on which the log platform application is installed or the log platform website is open. In one embodiment, the target data may be unstructured data.
103: Detect whether a target index corresponding to the fields exists in the preset index storage area; if it is detected that the target index does not exist in the preset index storage area, determine, based on the semantic information, the storage structure type to which the target data belongs.
To support queries on the target data, the server needs to create an index corresponding to the target data in the index storage area in advance. Once the index has been created, the target data can be queried by keyword, the index and the keyword having a correspondence. In one embodiment, the preset index storage area includes at least one index, and each index corresponds to keywords. In this case, when the server receives from the client a storage request for storing target data, it may perform field parsing on the target data to obtain at least one field corresponding to the target data. The at least one field may then be compared with the keywords of each pre-stored index; if any of the fields matches a keyword of any index, it is determined that the target index exists in the preset index storage area.
Conversely, if the server detects that none of the fields matches a keyword of any index, it determines that the target index does not exist in the preset index storage area.
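The existence check described above can be sketched as follows. This is a minimal illustration only: the function name `find_target_index` and the in-memory dictionary standing in for the preset index storage area are assumptions, not part of the described embodiments.

```python
from typing import Dict, Iterable, Optional, Set


def find_target_index(fields: Iterable[str],
                      index_keywords: Dict[str, Set[str]]) -> Optional[str]:
    """Return the name of the first index whose keywords contain any parsed field.

    `index_keywords` is a hypothetical in-memory view of the preset index
    storage area: index name -> keywords associated with that index.
    """
    field_set = set(fields)
    for index_name, keywords in index_keywords.items():
        if field_set & keywords:  # at least one field matches a keyword
            return index_name
    return None  # no target index exists, so one has to be created
```

A return value of `None` corresponds to the branch in which the storage structure type is determined and a new target index is requested.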
The storage structure type may be a keyword type or a word segmentation type. In one embodiment, when the server detects that the target index does not exist in the preset index storage area, it may examine the semantic information of each field corresponding to the target data. If the semantic information indicates that any field corresponding to the target data is used for exact-match lookup, the storage structure type to which the target data belongs is determined to be the keyword type.
If the semantic information indicates that any field corresponding to the target data is used for fuzzy-match lookup, the storage structure type to which the target data belongs is determined to be the word segmentation type. A field used for exact-match lookup may be called a first field. The first field is unique: for example, its semantic information may denote a user name or an identity document number, and each user corresponds to only one user name and one identity document number. A field used for fuzzy-match lookup may be called a second field. The semantic information of the second field is not unique: for example, it may denote a company name, and one company name may correspond to multiple users. In this way, the server can create different storage structures for different target data, which prevents the search server from splitting characters unnecessarily when it creates the target index and can effectively improve data processing efficiency.
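The type decision above can be illustrated with a short sketch. It makes several assumptions that go beyond the text: the semantic labels are hypothetical strings, the decision is made per field rather than for the target data as a whole, fields with unrecognized semantics fall back to segmentation, and the type names `keyword` and `text` are borrowed from Elasticsearch field types as one plausible realization of the keyword type and the word segmentation type.

```python
from typing import Dict

# Hypothetical semantic labels; the text gives only user name, identity
# document number and company name as examples.
EXACT_MATCH_SEMANTICS = {"user_name", "id_number"}  # unique per user

KEYWORD_TYPE = "keyword"   # stored whole, matched exactly
SEGMENTED_TYPE = "text"    # split into terms by word segmentation


def storage_structure_types(field_semantics: Dict[str, str]) -> Dict[str, str]:
    """Map each field name to a storage structure type based on its semantics."""
    result = {}
    for field, semantic in field_semantics.items():
        if semantic in EXACT_MATCH_SEMANTICS:
            result[field] = KEYWORD_TYPE
        else:
            # Assumption: non-unique or unknown semantics (for example a
            # company name) are treated as fuzzy-match fields.
            result[field] = SEGMENTED_TYPE
    return result
```

Keeping unique fields whole is what avoids the unnecessary splitting mentioned above: an identity document number is never searched by fragments, so segmenting it would only add work.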
104: Send, to the pre-connected search server, an index request for creating the target index, the index request carrying the storage structure type and the target data, so that the search server creates, according to the index request, a target index for the target data that matches the storage structure type.
105: Receive the target index returned by the search server, and store the target data according to the target index.
In one embodiment, a configuration file for the search server may be configured in advance. The configuration file contains settings for connecting to the search server, such as its address, port, protocol, connection timeout, number of routes for the protocol, and maximum number of connections. When the server detects that the search server has started, it may register with the search server based on the configuration file, that is, establish a connection with the search server, so that data can subsequently be exchanged with it.
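By way of illustration, the settings listed above could be held in a small configuration object such as the one below. The class name, field names, and default values are all hypothetical; the text names the kinds of settings but not their format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchServerConfig:
    """Hypothetical connection settings for the search server."""
    host: str                             # address of the search server
    port: int                             # port of the search server
    protocol: str = "http"                # connection protocol
    connect_timeout_seconds: float = 5.0  # connection timeout
    max_routes: int = 10                  # number of routes for the protocol
    max_connections: int = 50             # maximum number of connections

    def base_url(self) -> str:
        """Build the base URL used to reach the search server."""
        return f"{self.protocol}://{self.host}:{self.port}"
```

For example, `SearchServerConfig(host="127.0.0.1", port=9200).base_url()` yields `http://127.0.0.1:9200`; the host and port here are placeholder values.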
The search server may be Elasticsearch, a search server based on Lucene. It provides a distributed, multi-user full-text search engine with a RESTful web interface. Elasticsearch is developed in Java and released as open source under the terms of the Apache license, and it is a popular enterprise search engine. It is designed for use in cloud computing, offers real-time search, and is stable, reliable, fast, and easy to install and use.
In one embodiment, after the server has determined the storage structure type to which the target data belongs, it may send an index request for creating the target index to the search server with which a connection has been established. The index request includes the storage structure type of the target data and the target data itself. Based on the target data and the storage structure type, the search server can automatically create a target index for the target data that matches the storage structure type and return the target index to the server. After receiving the target index returned by the search server, the server may store the target data in the storage area corresponding to the target index and assign a keyword to the target data, so that the target data can later be queried with that keyword. The storage area corresponding to the target index may be a disk or a folder.
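One possible shape for such an index request is sketched below. It assumes the search server is Elasticsearch and that the storage structure types are expressed as the Elasticsearch field types `keyword` and `text`; the `mappings.properties` layout follows the request body of Elasticsearch's create-index API, while the wrapper keys `create_index_body` and `document` and the function name are hypothetical. Nothing is sent over the network here.

```python
from typing import Any, Dict


def build_index_request(field_types: Dict[str, str],
                        target_data: Dict[str, Any]) -> Dict[str, Any]:
    """Bundle the storage structure types and the target data into one request.

    `field_types` maps field name -> "keyword" or "text".
    """
    mappings = {
        "properties": {name: {"type": ftype} for name, ftype in field_types.items()}
    }
    return {
        "create_index_body": {"mappings": mappings},  # body for index creation
        "document": dict(target_data),                # copy of the target data
    }
```

Carrying the type alongside the data lets the search server create the matching mapping in one round trip instead of inferring a type for every field.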
In one embodiment, the data processing method may be applied to a plug-in corresponding to a data management platform, the plug-in being installed into the search server. For example, where the data management platform is a log platform, the plug-in may create a folder for its own service under the service folder of the search server. The folder contains the log cloud plug-in in the form of a jar package, the running information of the current plug-in, and the configuration files required by the log platform. The running information includes: description information describing the purpose of the plug-in; the version of the plug-in; the name under which the plug-in is displayed in the search server; the entry point of the plug-in; the Java version used by the plug-in; and the specific search server version for which the plug-in is released.
In one embodiment, if it is detected that the target index exists in the preset index storage area, the target data may be stored directly in the target index.
In the embodiments of this application, the server can create the target index according to the storage structure type to which the target data belongs, which avoids splitting fields that do not need to be split and helps improve the efficiency of index creation.
Referring to FIG. 2, FIG. 2 is a schematic flowchart of another data processing method provided by an embodiment of this application. The method can be executed by a server. As shown in the figure, the data processing method may include:
201: When a storage request for storing target data is received from a client, perform field parsing on the target data to obtain a field parsing result, the field parsing result including the fields corresponding to the target data and the semantic information of the fields.
202: Detect whether a target index corresponding to the fields exists in the preset index storage area; if it is detected that the target index does not exist in the preset index storage area, determine, based on the semantic information, the storage structure type to which the target data belongs.
203: Send, to the pre-connected search server, an index request for creating the target index, the index request carrying the storage structure type and the target data, so that the search server creates, according to the index request, a target index for the target data that matches the storage structure type.
204: Receive the target index returned by the search server, and store the target data according to the target index.
For the specific implementation of steps 201 to 204, refer to the description of steps 101 to 105 in the foregoing embodiment; details are not repeated here.
205: Send, to the client, update indication information for the target data, the update indication information instructing the client to update the target data according to a preset update policy.
In one embodiment, the preset update policy may be a delayed update policy or a timed update policy. The delayed update policy instructs the client to update the target data when it detects a trigger operation on the target data; the timed update policy instructs the client to update the target data after a preset time. The trigger operation may be a search operation for the target data, a viewing operation on the target data, or any other operation on the target data, which is not specifically limited in this application.
In one embodiment, the preset time may be 0 s, 1 s, and so on. It may be set by default in advance by a developer or selected by the user as needed. Once the preset time has been determined, the user may also adjust it as needed. A preset time of 0 s means that the refresh takes place immediately.
For example, where the preset update policy is the timed update policy and it instructs the client to update the target data after 0 s, the client may update the target data immediately after receiving the update indication information for the target data.
In one embodiment, where the preset update policy is the delayed update policy, the policy instructs the client to update the target data when it detects a viewing operation on the target data. The viewing operation may be, for example, a touch operation on a view button or a voice signal for viewing the target data. In this way, the target data does not have to be updated immediately at a heavy cost in hardware performance; the delayed refresh takes place the next time a trigger operation on the target data occurs, which protects hardware performance while still allowing the user to query the saved target data in time.
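The two update policies can be summarized in a small decision function. The sketch below is illustrative only; the enum, the function name, and its parameters are assumptions introduced for the example.

```python
from enum import Enum


class UpdatePolicy(Enum):
    DELAYED = "delayed"  # update when a trigger operation is detected
    TIMED = "timed"      # update once a preset time has elapsed


def should_update(policy: UpdatePolicy, *, triggered: bool = False,
                  elapsed_seconds: float = 0.0,
                  preset_seconds: float = 0.0) -> bool:
    """Decide on the client side whether the target data should be updated now."""
    if policy is UpdatePolicy.DELAYED:
        # Search, view, or any other operation on the target data.
        return triggered
    # Timed policy: a preset time of 0 s means an immediate refresh.
    return elapsed_seconds >= preset_seconds
```

Under the delayed policy nothing is refreshed until the data is actually needed, which is where the saving in hardware cost comes from.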
In one embodiment, after the index request for creating the target index has been sent to the pre-connected search server, if the target index returned by the search server is not received within a preset time, a preset index may be generated and the target data stored according to the preset index.
In one embodiment, a waiting duration (that is, the preset time) for the search server to return the target index may be set in advance. When the index request for creating the target index is sent to the pre-connected search server, a timer is started. If the elapsed time on the timer becomes greater than or equal to the waiting duration and the target index has still not been received, a preset index may be obtained and the target data stored in the storage area corresponding to the preset index.
In one embodiment, after generating the preset index, the server may also create an asynchronous thread for receiving the target index returned by the search server, that is, start a new thread that continues to wait for the search server to return the target index. If the server receives the target index returned by the search server through the asynchronous thread, it replaces the preset index with the target index and stores the target data according to the target index.
In one embodiment, a callback listener may be registered in advance for the event of receiving the target index returned by the search server. When the server does not receive the target index from the search server within the preset time, the callback listener may be activated (that is, the asynchronous thread described above is created) to receive the target index returned by the search server.
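The timeout fallback and the asynchronous hand-over can be sketched with Python's standard `concurrent.futures` module. This is one possible arrangement under stated assumptions: `request_index` is a hypothetical callable that blocks until the search server returns the target index name, `IndexStore` is a hypothetical holder for the index currently in use, and the default name `preset-index` is a placeholder.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor, TimeoutError


class IndexStore:
    """Hypothetical holder for the index currently used to store the target data."""

    def __init__(self) -> None:
        self.index_name = None
        self.upgraded = threading.Event()  # set when a preset index is replaced
        self._lock = threading.Lock()

    def use(self, index_name: str) -> None:
        with self._lock:
            self.index_name = index_name


def store_with_fallback(request_index, store: IndexStore, wait_seconds: float,
                        preset_index: str = "preset-index") -> Future:
    """Wait up to `wait_seconds` for the target index, else fall back and upgrade later."""
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(request_index)  # runs on a separate thread
    try:
        store.use(future.result(timeout=wait_seconds))
    except TimeoutError:
        store.use(preset_index)  # store under the preset index for now

        def on_done(done: Future) -> None:
            # Callback listener: swap in the target index once it arrives.
            if done.exception() is None:
                store.use(done.result())
                store.upgraded.set()

        future.add_done_callback(on_done)
    executor.shutdown(wait=False)  # do not block; the running request continues
    return future
```

The caller is never blocked longer than the waiting duration, and the target data remains reachable under the preset index until the late result arrives.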
In one embodiment, the data processing method of the embodiments of this application is applied to a server cluster in which N nodes (for example, N servers) are deployed. In this case, before the index request for creating the target index is sent to the pre-connected search server, the number of primary shards and replica shards corresponding to each node when the target data is sharded may be determined based on the number of nodes in the server cluster, added to the index request, and sent to the search server. After receiving the index request, the search server may shard the target data according to the number of primary shards and replica shards corresponding to each node, create the target index corresponding to each node, and, once creation is complete, return each target index to its corresponding node. In this way, the number of primary shards and replica shards can be set according to the number of nodes deployed in the cluster, which effectively reduces the hardware resources wasted by an excessive number of replica shards.
For example, where there are N nodes, the number of primary shards and replica shards corresponding to each node may be determined on the principle that the number of replica shards on the second node equals the number of primary shards on the first node, the number of replica shards on the third node equals the number of primary shards on the second node, and so on. A primary shard and its replica shard store the same data, which prevents data loss caused by hardware problems.
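The "and so on" rule above can be written out as a short sketch. Two points are assumptions not stated in the text: the first node takes its replica count from the last node (wrap-around), and a single-node cluster gets no replica shards because there is no other node to hold them. The function name and the input format are likewise hypothetical.

```python
from typing import List, Tuple


def shard_plan(primaries_per_node: List[int]) -> List[Tuple[int, int]]:
    """Return (primary_count, replica_count) for each node, in node order.

    Node i holds as many replica shards as node i-1 holds primary shards.
    """
    n = len(primaries_per_node)
    if n == 0:
        return []
    if n == 1:
        # Assumption: no second node exists to hold a copy.
        return [(primaries_per_node[0], 0)]
    # Index -1 wraps to the last node, so the first node mirrors the last.
    return [(primaries_per_node[i], primaries_per_node[i - 1]) for i in range(n)]
```

With three nodes holding 2, 3, and 1 primary shards, the plan is (2, 1), (3, 2), (1, 3): every primary shard has exactly one copy on a neighbouring node and no more.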
In the embodiments of this application, the server does not need to update the target data immediately at a heavy cost in hardware performance, which helps balance the protection of hardware performance against the timeliness with which the user can query the target data.
An embodiment of this application further provides a data processing apparatus. The apparatus includes modules for executing the method described in FIG. 1 or FIG. 2 and is configured on a server. Specifically, refer to FIG. 3, which is a schematic block diagram of the data processing apparatus provided by an embodiment of this application. The data processing apparatus of this embodiment includes:
a communication module 30, configured to receive, from a client, a storage request for storing target data, the storage request including the target data;
a processing module 31, configured to perform field parsing on the target data to obtain a field parsing result, the field parsing result including the fields corresponding to the target data and the semantic information of the fields;
the processing module 31 being further configured to detect whether a target index corresponding to the fields exists in a preset index storage area and, if it is detected that the target index does not exist in the preset index storage area, determine, based on the semantic information, the storage structure type to which the target data belongs;
the communication module 30 being further configured to send, to a pre-connected search server, an index request for creating the target index, and to receive the target index returned by the search server, the index request carrying the storage structure type and the target data, so that the search server creates, according to the index request, a target index for the target data that matches the storage structure type;
the processing module 31 being further configured to store the target data according to the target index.
In one embodiment, the processing module 31 is specifically configured to determine the storage structure type to which the target data belongs to be the keyword type if the semantic information indicates that a field corresponding to the target data is used for exact-match lookup.
In one embodiment, the processing module 31 is further specifically configured to determine the storage structure type to which the target data belongs to be the word segmentation type if the semantic information indicates that a field corresponding to the target data is used for fuzzy-match lookup.
In one embodiment, the communication module 30 is further configured to send, to the client, update indication information for the target data after the target data has been stored in the target index, the update indication information instructing the client to update the target data according to a preset update policy.
In one embodiment, the preset update policy is a delayed update policy or a timed update policy, where the delayed update policy instructs the client to update the target data when it detects a trigger operation on the target data, and the timed update policy instructs the client to update the target data after a preset time.
In one embodiment, the processing module 31 is further configured to, after the index request for creating the target index has been sent to the pre-connected search server, generate a preset index and store the target data according to the preset index if the target index returned by the search server is not received within a preset time.
In one embodiment, the processing module 31 is further configured to, after generating the preset index, create an asynchronous thread for receiving the target index returned by the search server and, if the target index returned by the search server is received through the asynchronous thread, replace the preset index with the target index and store the target data according to the target index.
It should be noted that the functions of the functional modules of the data processing apparatus described in the embodiments of this application can be implemented according to the methods in the method embodiments of FIG. 1 or FIG. 2. For the specific implementation process, refer to the description of the method embodiments of FIG. 1 or FIG. 2; details are not repeated here.
In the embodiments of this application, when the communication module 30 receives from the client a storage request for storing target data, the processing module 31 performs field parsing on the target data to obtain the fields corresponding to the target data and the semantic information of the fields, and detects whether a target index corresponding to the fields exists in the preset index storage area. If it is detected that the target index does not exist in the preset index storage area, the processing module 31 determines, based on the semantic information, the storage structure type to which the target data belongs. The communication module 30 then sends an index request for creating the target index to the pre-connected search server and receives the target index returned by the search server, and the processing module 31 stores the target data according to the target index. With the embodiments of this application, the target index can be created according to the storage structure type to which the target data belongs, which avoids splitting fields that do not need to be split and helps improve the efficiency of index creation.
Refer to FIG. 4, which is a schematic block diagram of a server provided in an embodiment of this application. As shown in FIG. 4, the server includes a processor 401, a memory 402, and a network interface 403. The processor 401, the memory 402, and the network interface 403 may be connected by a bus or in another manner; FIG. 4 uses connection by a bus as an example. The network interface 403 is controlled by the processor 401 to send and receive messages, the memory 402 is configured to store a computer program that includes program instructions, and the processor 401 is configured to execute the program instructions stored in the memory 402.
The processor 401 is configured to invoke the program instructions to perform the following: when a storage request for storing target data is received from a client through the network interface 403, performing field parsing on the target data to obtain a field parsing result, where the field parsing result includes the field corresponding to the target data and the semantic information of the field; detecting whether a target index corresponding to the field exists in a preset index storage area; if it is detected that the target index does not exist in the preset index storage area, determining, based on the semantic information, the storage structure type to which the target data belongs; sending, through the network interface 403, an index request for creating the target index to a pre-connected search server, where the index request carries the storage structure type and the target data, so that the search server creates, according to the index request, a target index for the target data that matches the storage structure type; and receiving, through the network interface 403, the target index returned by the search server, and storing the target data according to the target index.
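The sequence performed by the processor 401 can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from this application: the `FakeSearchServer` class stands in for the pre-connected search server, the dictionary `index_area` stands in for the preset index storage area, `classify` is any callable that maps semantic information to a storage structure type, and the target data is modeled as a mapping from field name to a `(value, semantics)` pair.

```python
class FakeSearchServer:
    """Hypothetical stand-in for the pre-connected search server."""

    def __init__(self):
        self.requests = []

    def create_index(self, field_name, structure_type, data):
        # Record the index request and return a made-up index name.
        self.requests.append((field_name, structure_type, data))
        return f"idx_{field_name}_{structure_type}"


def handle_storage_request(target_data, index_area, search_server, storage, classify):
    """Store each field of target_data under its target index.

    target_data: {field_name: (value, semantics)}  (assumed parsing result)
    index_area:  {field_name: index_name}          (preset index storage area)
    storage:     {index_name: [values]}            (where target data ends up)
    """
    for field_name, (value, semantics) in target_data.items():
        # Detect whether a target index already exists for this field.
        target_index = index_area.get(field_name)
        if target_index is None:
            # No index yet: derive the storage structure type from the
            # semantic information and ask the search server for an index.
            structure_type = classify(semantics)
            target_index = search_server.create_index(
                field_name, structure_type, value
            )
            # Assumption: the returned index is remembered for later requests.
            index_area[field_name] = target_index
        # Store the target data according to the target index.
        storage.setdefault(target_index, []).append(value)
    return storage
```

In this sketch a field that already has an index causes no request to the search server, which is the point of checking the preset index storage area first.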
In one embodiment, the processor 401 is specifically configured to determine that the storage structure type to which the target data belongs is the keyword type if it is detected, based on the semantic information, that the field corresponding to the target data is used for exact-match lookup.
In one embodiment, the processor 401 is further specifically configured to determine that the storage structure type to which the target data belongs is the word-segmentation type if it is detected, based on the semantic information, that the field corresponding to the target data is used for fuzzy-match lookup.
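The mapping from semantic information to storage structure type described in the two embodiments above can be sketched as follows. The string values `"exact"` and `"fuzzy"` are a hypothetical encoding of the semantic information, and the enum and function names are illustrative only.

```python
from enum import Enum


class StorageStructureType(Enum):
    KEYWORD = "keyword"      # field is matched as a whole and is not split
    SEGMENTED = "segmented"  # field is split into words before indexing


def determine_storage_type(match_mode):
    """Map a field's semantic information to a storage structure type.

    match_mode is an assumed encoding: "exact" for exact-match lookup,
    "fuzzy" for fuzzy-match lookup.
    """
    if match_mode == "exact":
        return StorageStructureType.KEYWORD
    if match_mode == "fuzzy":
        return StorageStructureType.SEGMENTED
    raise ValueError(f"unknown match mode: {match_mode!r}")
```

A field such as an order number would plausibly fall under the keyword type, because splitting it into words gains nothing, while free text such as a title would fall under the word-segmentation type.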
In one embodiment, the network interface 403 is further configured to, after the target data is stored in the target index, send update indication information for the target data to the client, where the update indication information instructs the client to update the target data according to a preset update strategy.
In one embodiment, the preset update strategy includes a delayed update strategy or a timed update strategy. The delayed update strategy instructs the client to update the target data when the client detects a trigger operation on the target data; the timed update strategy instructs the client to update the target data after a preset time has elapsed.
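How a client might act on the two strategies can be sketched as a single decision function. This is an illustrative assumption about client behavior: the names `UpdateStrategy` and `should_update`, and the representation of the trigger operation as a boolean and of time as seconds, are not taken from this application.

```python
from enum import Enum


class UpdateStrategy(Enum):
    DELAYED = "delayed"  # update only when a trigger operation is detected
    TIMED = "timed"      # update once the preset time has elapsed


def should_update(strategy, triggered, elapsed_s, preset_s):
    """Decide whether the client should update the target data now.

    triggered: whether a trigger operation on the target data was detected
    elapsed_s: seconds since the update indication was received
    preset_s:  preset time, in seconds, used by the timed strategy
    """
    if strategy is UpdateStrategy.DELAYED:
        return triggered
    if strategy is UpdateStrategy.TIMED:
        return elapsed_s >= preset_s
    raise ValueError(f"unknown strategy: {strategy!r}")
```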
In one embodiment, the processor 401 is further configured to, after sending the index request for creating the target index to the pre-connected search server, generate a preset index if the target index returned by the search server is not received within a preset time, and store the target data according to the preset index.
In one embodiment, the processor 401 is further configured to, after generating a preset index, create an asynchronous thread for receiving the target index returned by the search server; and, if the target index returned by the search server is received through the asynchronous thread, update the preset index with the target index and store the target data according to the target index.
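The timeout fallback and the asynchronous receipt of the target index can be sketched with the Python standard library. This is a sketch under stated assumptions, not the implementation of this application: `LocalStore`, `PRESET_INDEX`, and `store_with_fallback` are hypothetical names, `request_index` is any callable that blocks until the search server returns an index name, and "updating the preset index" is modeled as moving the stored data from the preset index name to the target index name.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

PRESET_INDEX = "preset_index"  # hypothetical placeholder index name


class LocalStore:
    """Hypothetical in-memory storage keyed by index name."""

    def __init__(self):
        self._lock = threading.Lock()
        self.by_index = {}

    def put(self, index_name, data):
        with self._lock:
            self.by_index[index_name] = data

    def move(self, old_index, new_index):
        # Replace the preset index with the target index, keeping the data.
        with self._lock:
            self.by_index[new_index] = self.by_index.pop(old_index)


def store_with_fallback(request_index, data, store, timeout_s):
    """Store data under the target index, falling back to a preset index.

    Returns None if the target index arrived in time, otherwise the
    background thread that will swap in the target index later.
    """
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(request_index)
    try:
        # Wait at most the preset time for the search server's answer.
        store.put(future.result(timeout=timeout_s), data)
        pool.shutdown(wait=False)
        return None
    except FutureTimeout:
        # No answer in time: store under a preset index for now.
        store.put(PRESET_INDEX, data)

    def wait_for_target():
        target = future.result()  # blocks until the search server answers
        store.move(PRESET_INDEX, target)
        pool.shutdown(wait=False)

    # Asynchronous thread that receives the late target index.
    waiter = threading.Thread(target=wait_for_target, daemon=True)
    waiter.start()
    return waiter
```

The benefit of this arrangement is that the storage request is never blocked for longer than the preset time, while the data still ends up under the intended target index once the search server responds. Error handling for a failed index request is omitted from the sketch.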
It should be understood that, in the embodiments of this application, the processor 401 may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor.
The memory 402 may include read-only memory and random access memory, and provides instructions and data to the processor 401. A part of the memory 402 may also include non-volatile random access memory. For example, the memory 402 may also store device type information.
In a specific implementation, the processor 401, the memory 402, and the network interface 403 described in the embodiments of this application may perform the implementations described in the method embodiment of FIG. 1 or FIG. 2, and may also perform the implementation of the data processing apparatus described in the embodiments of this application; details are not repeated here.
Another embodiment of this application provides a non-volatile computer-readable storage medium. The storage medium stores a computer program that includes program instructions, and when the program instructions are executed by a processor, the following is implemented: when a storage request for storing target data is received from a client, performing field parsing on the target data to obtain a field parsing result, where the field parsing result includes the field corresponding to the target data and the semantic information of the field; detecting whether a target index corresponding to the field exists in a preset index storage area; if it is detected that the target index does not exist in the preset index storage area, determining, based on the semantic information, the storage structure type to which the target data belongs; sending an index request for creating the target index to a pre-connected search server, where the index request carries the storage structure type and the target data, so that the search server creates, according to the index request, a target index for the target data that matches the storage structure type; and receiving the target index returned by the search server, and storing the target data according to the target index.
The non-volatile computer-readable storage medium may be an internal storage unit of the server described in any of the foregoing embodiments, such as a hard disk or memory of the server. The storage medium may also be an external storage device of the server, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card equipped on the server. Further, the storage medium may include both an internal storage unit of the server and an external storage device. The storage medium is used to store the computer program and other programs and data required by the server, and may also be used to temporarily store data that has been output or is to be output.
A person of ordinary skill in the art can understand that all or part of the processes of the methods in the foregoing embodiments may be completed by a computer program instructing relevant hardware. The program may be stored in a computer-readable storage medium, and when executed, the program may include the processes of the foregoing method embodiments. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
What is disclosed above is merely some of the embodiments of this application and certainly cannot be used to limit the scope of rights of this application. A person of ordinary skill in the art can understand and implement all or part of the processes of the foregoing embodiments, and equivalent changes made in accordance with the claims of this application still fall within the scope covered by the invention.