WO2021068549A1

WO2021068549A1 - Data processing method, platform and system

Info

Publication number: WO2021068549A1
Application number: PCT/CN2020/096999
Authority: WO
Inventors: 万鹏程; 吕勇; 李春生; 贾洪园
Original assignee: 苏宁易购集团股份有限公司; 苏宁云计算有限公司
Priority date: 2019-10-10
Filing date: 2020-06-19
Publication date: 2021-04-15
Also published as: CA3154438A1; CN110837520A

Abstract

A data processing method, platform and system. The method comprises: storing original commodity content data in a first relational database by cluster, by library and by table (S51); establishing index data according to the original commodity content data and storing the index data in an index database (S52), the index data comprising keyword fields and query dimension identification data corresponding to each keyword field; and computing the original commodity content data by means of a computing program to obtain computing result data, and associatively storing the computing result data and the query dimension identification data in the first relational database (S53). The computing efficiency is improved, and an index database is established according to query dimensions to perform indexing in advance during a subsequent query, which inevitably improves the querying efficiency.

Description

Data processing method, platform and system

Technical field

This application relates to the field of business data calculation and query, and in particular to a data processing method, platform and system.

Background art

When selling goods, merchants often rely on analytical data to guide their operations. Most of this data is derived from the platform's analysis and calculation of large volumes of product content data. For example, the product content quality score, which characterizes the quality of product description information, can provide operational guidance to merchants selling physical goods. This score is obtained by the platform summarizing, analyzing and calculating the product content data of many merchants. At present, such summary analysis and calculation is mostly implemented with Java and the relational database MySQL, and merchants who need the calculation result data query MySQL directly.

However, with the rapid development of e-commerce, a large amount of product content data is generated, and the volume grows substantially during platform promotion periods such as "Double Eleven", "618", "818" and "Double 12". The Java plus MySQL approach is inefficient at data calculation, and it also leads to low query efficiency when merchants query the calculation result data. With complex query conditions in particular, query times are generally on the order of seconds.

Summary of the invention

This application provides a data processing method, platform, and system to solve the problem of low efficiency in calculating and querying product content data in the prior art.

This application provides the following solutions:

In one aspect, a data processing method is provided, and the method includes:

Store the original product content data in clusters, databases and tables in the first relational database;

Create index data according to the original product content data and store it in an index database; the index data includes keyword fields and query dimension identification data corresponding to each keyword field;

Calling a calculation program to calculate the original product content data to obtain calculation result data and store the calculation result data in association with the query dimension identification data in the first relational database.

Preferably, the method further includes:

Receive user's query request;

Parse the query request to obtain keywords to be queried;

Query in the index database to obtain query dimension identification data corresponding to the keyword to be queried as a target identification;

Query in the first relational database to obtain calculation result data corresponding to the target identifier.

Preferably, the method further includes:

At least part of the calculation result data is associated with the query dimension identification data and stored in the index database.

Preferably,

The calculation result data obtained by invoking the calculation program to calculate the original product content data includes:

Invoking a calculation program to calculate the content quality score of each dimension of each product in at least two content dimensions for the original product content data, and calculate the total content quality score of each product according to the scores of each dimension;

The storing the calculation result data in association with the query dimension identification data in the first relational database includes:

Storing the content quality score of each dimension of each commodity and the total content quality score of each commodity in association with the query dimension identification data in the first relational database;

The storing at least part of the calculation result data in association with the query dimension identification data in the index database includes:

The total quality score of each commodity is associated with the query dimension identification data and stored in the index database.

Preferably, the query dimension identification data is a product code and/or a merchant code.

Preferably, the method further includes:

Receiving the original product content data and storing it in a second relational database in clusters, databases and tables;

Synchronize the original product content data in the second relational database to the first relational database.

Preferably, the receiving the original product content data and storing it in a second relational database in clusters, databases and tables includes:

The original product content data is received and stored in the second relational database by cluster, database and table according to the product code.

Preferably,

The first relational database is Hbase, the second relational database is Mysql, the calculation program is Spark, and the index database is Elasticsearch.

Another aspect of the present application also provides a data processing platform, which includes a data storage layer and a data calculation layer;

The data storage layer is used to store the original product content data in clusters, databases and tables in a first relational database, and to build index data based on the original product content data and store it in the index database; the index data includes keyword fields and query dimension identification data corresponding to each keyword field;

The data calculation layer is used to call a calculation program to calculate the original product content data to obtain calculation result data and store the calculation result data in association with the query dimension identification data in the first relational database.

In yet another aspect of this application, a computer system is also provided, including:

One or more processors; and

A memory associated with the one or more processors, where the memory is used to store program instructions, and when the program instructions are read and executed by the one or more processors, perform the following operations:

A calculation program is called to calculate the original product content data to obtain calculation result data, and the calculation result data is associated with the query dimension identification data and stored in the first relational database.

According to the specific embodiments provided in this application, this application discloses the following technical effects:

The technical solution of this application improves calculation efficiency by storing the original product content data in a relational database by cluster, database and table and calling a calculation program to perform the calculation. It also establishes an index database according to the query dimensions, so that subsequent queries are resolved through the index first, which improves query efficiency. Compared with the prior art, this solution can quickly provide multi-dimensional queries of the calculation result data and avoids the low efficiency caused by querying a relational database directly.

Of course, implementing any product of the present application does not necessarily need to achieve all the advantages described above at the same time.

Description of the drawings

In order to explain the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application; those of ordinary skill in the art can obtain other drawings based on these drawings without creative work.

Fig. 1 is a structural diagram of a data processing platform provided by an embodiment of the present application;

Figure 2 is a schematic diagram of a cluster database provided by an embodiment of the present application;

FIG. 3 is a flowchart of original product content data synchronization provided by an embodiment of the present application;

FIG. 4 is a flowchart of a product content quality score query according to an embodiment of the present application;

Fig. 5 is a flowchart of a data processing method provided by an embodiment of the present application;

Fig. 6 is an architecture diagram of a computer system provided by an embodiment of the present application.

Detailed description

The technical solutions in the embodiments of the present application will be clearly and completely described below in conjunction with the accompanying drawings in the embodiments of the present application. Obviously, the described embodiments are only a part of the embodiments of the present application, rather than all the embodiments. Based on the embodiments in this application, all other embodiments obtained by a person of ordinary skill in the art fall within the protection scope of this application.

This application aims to provide a method for processing product content data. The original product content data is stored in a relational database by cluster, database and table, and a calculation program is then called to perform parallel calculations on that data to improve calculation efficiency. Index data is established according to the required query dimensions, so that a subsequent query first matches identification data in the index database and then retrieves the results from the relational database. This solution can quickly provide multi-dimensional queries of the calculation result data and improves query efficiency.

As shown in Fig. 1, the data processing platform of one embodiment of this application includes a MySQL database, an HBase database, the Spark calculation program, the search engine Elasticsearch, the remote service framework RSF, and the merchants that perform queries.

The MySQL database receives the original product content data and stores this large volume of data by cluster, sub-database and sub-table. Specifically, the data can be divided into clusters, sub-databases and sub-tables according to the product code; the specific operations are described in detail later.

The HBase database is synchronized from the data in the MySQL database. Specifically, synchronization can be accomplished through a data replication platform and a data exchange platform. After synchronization, the HBase database stores the original product content data by cluster, database and table.

In other embodiments of the present application, the original product content data can be stored directly in the HBase database without going through the MySQL database. Going through the MySQL database serves two purposes: it provides a stable data backup, and it accommodates other business processes that rely on the MySQL database.

To serve later queries, the result data is calculated, an index is established, and an association is established between the index and the calculated result data, so that the result data can be queried by way of the index data:

The index database Elasticsearch stores keyword fields for query, such as product brands, and identification data corresponding to the keyword fields, such as product codes. Based on the index, the query keywords entered by the user (merchant) can be matched to the corresponding product code.

The Spark calculation program performs MapReduce (a programming model for parallel operations on large-scale data sets, i.e. larger than 1 TB) on the original product content data of each cluster, partitioned by the number segment of the product code and following the expression rules, to obtain the calculation result, such as the product content quality score. The calculation result is then stored in the HBase database together with identification data such as the product code.

After the above steps, the association between the index data in Elasticsearch and the calculation result data in the Hbase database is established through the identification data.

When a user enters a query keyword, RSF first searches in the index to determine matching identification data such as a product code, and then determines the calculation result data in the Hbase database according to the product code.

The establishment of the aforementioned index can be independent of the establishment of the calculation process. Of course, in this application, at least a part of the calculation result can also be stored in the index database. When the query of this part of the result is performed, it can be completed only through Elasticsearch, without further querying in the Hbase database.

It should be noted that the aforementioned Mysql database, Hbase database, Spark calculation program, and search engine Elasticsearch can all be replaced by modules with similar functions. Figure 1 is only a specific system structure of this application.

Taking the system shown in Figure 1 and the calculation of the product content quality score as an example, the following processes are introduced in detail: storing the original product content data by cluster, database and table; synchronizing the product content data; calculating the product content quality score; synchronizing the product content quality score data; building the index; and querying the product content quality score:

The original product content data is stored in clusters, databases, and tables:

The original product content data is stored in 4 MySQL clusters according to the number segment of the product code. Within each cluster, the data is distributed across 10 sub-databases according to the result of taking the last two digits of the product code modulo 10, and within each sub-database across 10 sub-tables according to the result of taking the last digit of the product code modulo 10. In this way, more than one billion product content records are spread over hundreds of sub-tables. Figure 2 shows a schematic diagram of the cluster sub-databases.

For example, define the number segment of the product codes stored in each cluster: product data in the segment from 000000000000000000 to 000000000500000000 is stored in cluster 1; from 000000000500000001 to 000000001000000000 in cluster 2; from 000000001000000001 to 000000001500000000 in cluster 3; and from 000000001500000001 to 000000002000000000 in cluster 4.

Define the sub-database of the cluster to which each product belongs: the sub-database is selected by the result of taking the last two digits of the product code modulo 10.

Define the sub-table of the sub-database to which each product belongs: the sub-table is selected by the result of taking the last digit of the product code modulo 10.

For example, the product code 000000001500000023 belongs to cluster 4, sub-database 3 and sub-table 4.
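The routing rule can be illustrated with a short Python sketch. The cluster ranges are taken from the example segments in the text. The sub-database and sub-table numbering is an assumption: as literally worded, "last two digits modulo 10" and "last digit modulo 10" would always yield the same value, so this sketch reads the rule as the tens digit selecting the sub-database and the units digit selecting the sub-table, both numbered from 1. That is one reading that reproduces the worked example (cluster 4, sub-database 3, sub-table 4); the function name `route` is hypothetical.

```python
from bisect import bisect_left

# Inclusive upper bounds of the product-code number segments for clusters 1..4,
# taken from the example ranges in the text.
CLUSTER_UPPER_BOUNDS = [500_000_000, 1_000_000_000, 1_500_000_000, 2_000_000_000]


def route(product_code: str) -> tuple[int, int, int]:
    """Return (cluster, sub_database, sub_table), each numbered from 1."""
    value = int(product_code)
    idx = bisect_left(CLUSTER_UPPER_BOUNDS, value)
    if value < 0 or idx >= len(CLUSTER_UPPER_BOUNDS):
        raise ValueError(f"product code {product_code} is outside all segments")
    cluster = idx + 1
    # Assumption: tens digit picks the sub-database, units digit the sub-table.
    sub_database = (value % 100) // 10 + 1
    sub_table = value % 10 + 1
    return cluster, sub_database, sub_table


print(route("000000001500000023"))
```

With 4 clusters, 10 sub-databases per cluster and 10 sub-tables per sub-database, such a scheme yields 400 sub-tables, consistent with the "hundreds of sub-tables" mentioned above.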

Original product content data synchronization:

There are three types of original product content data synchronization: quasi-real-time incremental update, daily incremental update, and weekly full update. The daily incremental update and weekly full update are for fault tolerance.

As shown in Figure 3, a real-time data replication platform RDRS (RealTime Data Replication System) can be defined to synchronize MySQL data to HBase in quasi-real time, and a data exchange platform IDE can be defined to synchronize MySQL data to HBase through daily incremental and weekly full transfers:

The RDRS platform synchronizes product content data to HBase by analyzing the binlog information of the Mysql database cluster in quasi-real time.

The data exchange platform synchronizes the incremental data of product content information to HBase every day, and compares and corrects it with the quasi-real-time HBase product content data.

The data exchange platform synchronizes the full amount of product data to HBase every week, and compares and corrects it with the current HBase product content data.
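The three synchronization modes can be sketched in Python with plain dictionaries standing in for the source and target stores. This is a minimal illustration, not the RDRS or IDE implementation: the event format, the function names `apply_events` and `reconcile`, and the choice to delete target rows that are absent from a full snapshot are all assumptions.

```python
def apply_events(target: dict, events: list) -> None:
    """Quasi-real-time path: replay change events (op, product_code, row) on the target."""
    for op, code, row in events:
        if op == "delete":
            target.pop(code, None)
        else:  # treat inserts and updates alike as an upsert
            target[code] = row


def reconcile(target: dict, source_rows: dict, full: bool = False) -> int:
    """Batch path: compare source rows with the target and correct any differences.

    With full=False the source rows are a daily increment; with full=True they are
    a complete snapshot, and target rows missing from it are removed (assumption).
    Returns the number of rows corrected.
    """
    corrected = 0
    for code, row in source_rows.items():
        if target.get(code) != row:
            target[code] = row
            corrected += 1
    if full:
        for code in list(target.keys() - source_rows.keys()):
            del target[code]
            corrected += 1
    return corrected


hbase = {}
apply_events(hbase, [("upsert", "A", {"title": "t1"}), ("upsert", "B", {"title": "t2"})])
# Suppose an update to A was missed in real time; the daily increment repairs it.
daily_fixed = reconcile(hbase, {"A": {"title": "t1b"}})
# The weekly full snapshot no longer contains B, so B is removed from the target.
weekly_fixed = reconcile(hbase, {"A": {"title": "t1b"}}, full=True)
```

The batch passes exist for fault tolerance: they repair whatever the quasi-real-time path missed.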

Product content quality score calculation:

The quality of product content is mainly affected by seven content dimensions: basic information, parameter information, category information, main image information, title information, selling point information, and detailed information. For each sub-database in parallel, the Spark program calculates the scores of these seven dimensions for all products in the sub-database according to the expression rules, and finally summarizes the dimension scores and writes them into Hive (a data warehouse tool for Hadoop). Specifically:

First, use MapReduce to calculate, sub-database by sub-database, the basic information, parameter information, category information, main image information, title information, selling point information, and detailed information scores of the products in every sub-database. Calculating by sub-database mainly serves to reduce data skew, thereby improving calculation efficiency.

Then combine the basic information, parameter information, category information, main image information, title information, selling point information, and detailed information scores to obtain the total score.
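The per-sub-database scoring can be sketched in Python. A thread pool stands in for the Spark job, and the scoring rule is a placeholder: the text does not disclose the expression rules, so this sketch simply awards one point per non-empty dimension and sums the dimension scores into the total. The field names and function names are hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# The seven content dimensions named in the text (field names are hypothetical).
DIMENSIONS = ("basic", "parameters", "category", "main_image",
              "title", "selling_points", "details")


def score_product(product: dict) -> dict:
    """Placeholder rule: 1 point per non-empty dimension; total is the plain sum."""
    scores = {dim: (1.0 if product.get(dim) else 0.0) for dim in DIMENSIONS}
    scores["total"] = sum(scores[dim] for dim in DIMENSIONS)
    return scores


def score_partition(products: list) -> dict:
    """Map step: score every product in one sub-database."""
    return {p["code"]: score_product(p) for p in products}


def score_all(products: list) -> dict:
    """Group products by sub-database, score the groups in parallel, merge the results."""
    partitions = defaultdict(list)
    for p in products:
        partitions[p["sub_database"]].append(p)
    results = {}
    with ThreadPoolExecutor() as pool:
        for partial in pool.map(score_partition, partitions.values()):
            results.update(partial)
    return results


sample = [
    {"code": "P1", "sub_database": 1, "title": "Phone", "main_image": "a.png"},
    {"code": "P2", "sub_database": 2, "title": "Case"},
]
scores = score_all(sample)
```

Partitioning by sub-database keeps the groups of similar size, which is the point the text makes about reducing data skew.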

The following is a test of the calculation efficiency of this application and the prior art:

Insert 1 million, 10 million, and 100 million records to be calculated into the to-be-calculated table for product quality evaluation, then perform the calculation with Java+MySQL and with Spark+HBase respectively. The test results are recorded in Table 1.

Table 1. Comparison of computing efficiency between Spark+HBase and Java+MySQL

Number of records in the table	Spark+HBase	Java+MySQL
1 million	30 minutes	8 hours
10 million	2 hours	3 days
100 million	5 hours	30 days

The test results show that calculation based on the combination of Spark+HBase greatly improves calculation efficiency. Even as the number of records grows by multiples, the calculation efficiency remains excellent.
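To make the comparison concrete, the speedup implied by the Table 1 figures can be computed directly. The sketch below only restates the reported durations in hours and divides them; no new measurements are introduced.

```python
# Durations from Table 1, converted to hours: (Spark+HBase, Java+MySQL).
TABLE_1_HOURS = {
    "1 million": (0.5, 8.0),
    "10 million": (2.0, 3 * 24.0),
    "100 million": (5.0, 30 * 24.0),
}

# Speedup = baseline duration divided by Spark+HBase duration.
speedups = {
    records: baseline / spark
    for records, (spark, baseline) in TABLE_1_HOURS.items()
}

for records, ratio in speedups.items():
    print(f"{records} records: about {ratio:.0f}x faster")
```

On the reported figures the ratio is 16x, 36x and 144x respectively, so the advantage widens as the data volume grows.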

Product content quality score data synchronization:

According to the set query dimensions, such as product and merchant, the scores are summarized to obtain the corresponding totals, such as the total score of a given product or of a given merchant. Other dimensions are of course also possible. The product content quality score of each dimension, the total product content quality score, and the scores aggregated by the set query dimensions are then synchronized to HBase.
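Aggregation by query dimension can be sketched as a simple group-and-sum in Python. The text does not say whether the per-dimension totals are sums, averages or something else, so summation here is an assumption, as are the row layout and the name `aggregate_by`.

```python
from collections import defaultdict


def aggregate_by(rows: list, key: str) -> dict:
    """Sum the 'total' score of all rows that share the same value of `key`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row["total"]
    return dict(totals)


rows = [
    {"product_code": "P1", "merchant_code": "M1", "total": 5.0},
    {"product_code": "P2", "merchant_code": "M1", "total": 3.0},
    {"product_code": "P3", "merchant_code": "M2", "total": 4.0},
]
by_merchant = aggregate_by(rows, "merchant_code")
by_product = aggregate_by(rows, "product_code")
```

Adding another query dimension then only requires calling the same function with a different key.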

Query dimension index establishment:

Establish index data according to query dimensions such as product code and merchant code. The index data includes keyword fields and the corresponding query dimension identification data, such as a product brand and the corresponding product code.

The index can be established as part of the synchronization process of the product content quality score data. When the product content quality score data is calculated and synchronized to HBase, the correspondence between the keyword fields in the original product data and the query dimension identification data is established, and the total score data obtained for query dimensions such as product code and merchant code is synchronized to the index data.

The relevant calculation result data in Elasticsearch and HBase, such as the product content quality score data, is updated incrementally.
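The shape of such index data can be sketched with an in-memory dictionary in Python. This stands in for the index database and does not use any Elasticsearch API; the document layout, the lower-casing of keywords and the name `build_index` are assumptions for illustration.

```python
from collections import defaultdict


def build_index(products: list, totals: dict, keyword_fields=("brand",)) -> dict:
    """Map (keyword field, keyword value) to documents holding the identification
    data (product code, merchant code) plus the total score."""
    index = defaultdict(list)
    for p in products:
        doc = {
            "product_code": p["product_code"],
            "merchant_code": p["merchant_code"],
            "total": totals.get(p["product_code"]),
        }
        for field in keyword_fields:
            value = p.get(field)
            if value:
                index[(field, value.lower())].append(doc)
    return dict(index)


products = [
    {"product_code": "P1", "merchant_code": "M1", "brand": "Acme"},
    {"product_code": "P2", "merchant_code": "M2", "brand": "Acme"},
    {"product_code": "P3", "merchant_code": "M1", "brand": "Globex"},
]
index = build_index(products, totals={"P1": 5.0, "P2": 3.0, "P3": 4.0})
```

Because each index document carries the total score, a query that needs only totals can be answered from the index without a second lookup.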

Product content quality score query:

Different types of query conditions require corresponding query interfaces and request parameters. According to the query conditions, the query first goes to Elasticsearch to obtain the corresponding product codes and merchant codes, then uses those codes to query HBase for the required data, and finally integrates and filters the qualifying data and returns it to the user. Specifically:

First, define the remote service framework RSF (Remote Service Framework), which provides remote query services to the querier component, and define the query service (QueryService), which processes merchant queries. According to the query conditions entered by the merchant, the query service calls the RSF service to perform various types of iterative queries and then intersects the results of the sub-query conditions; the sub-queries run concurrently.
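Running the sub-queries concurrently and intersecting their results can be sketched as follows in Python. A dictionary stands in for the index, a thread pool stands in for the concurrent RSF calls, and the names `run_sub_query` and `query_codes` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_sub_query(index: dict, condition: tuple) -> set:
    """Resolve one (field, value) condition to the set of matching product codes."""
    return set(index.get(condition, ()))


def query_codes(index: dict, conditions: list) -> set:
    """Run every sub-query concurrently, then keep only codes matched by all of them."""
    if not conditions:
        return set()
    with ThreadPoolExecutor(max_workers=len(conditions)) as pool:
        results = list(pool.map(lambda c: run_sub_query(index, c), conditions))
    return set.intersection(*results)


index = {
    ("brand", "acme"): {"P1", "P2"},
    ("category", "phone"): {"P2", "P3"},
}
matched = query_codes(index, [("brand", "acme"), ("category", "phone")])
```

Intersecting means a product is returned only if it satisfies every condition the merchant entered.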

Figure 4 is a flowchart of the product content quality score query process, including the following steps:

The client sends a query service request for the product quality score;

The query server performs expression analysis on the product quality score query service request sent by the client;

The query server submits the parsed query request to an Elasticsearch cluster (Elasticsearch Cluster); in this embodiment, Elasticsearch sets up the cluster to prevent a single point of failure of the machine.

The Elasticsearch cluster returns the query result (product code + merchant code) to the query server;

The query server submits a query request to the HBase Cluster (HBase Cluster) according to the query result returned by the Elasticsearch cluster;

The HBase cluster returns the final query result corresponding to the product code and merchant code to the query server;

The query server returns the final query result to the client.
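The steps above can be sketched end to end in Python, with dictionaries standing in for the Elasticsearch and HBase clusters. The request format, the `detail` flag and the name `handle_query` are assumptions; the flag illustrates the case described earlier in which total scores stored in the index can be returned without a second lookup.

```python
def handle_query(request: dict, index: dict, result_store: dict, detail: bool = True) -> list:
    """Parse the request, resolve identifiers via the index, then fetch the results."""
    # Steps 1-2: parse the request into a normalized keyword.
    keyword = request.get("keyword", "").strip().lower()
    # Steps 3-4: the index returns product code + merchant code for the keyword.
    hits = index.get(keyword, [])
    if not detail:
        # Totals kept in the index are enough; skip the result store entirely.
        return [{"product_code": h["product_code"], "total": h["total"]} for h in hits]
    # Steps 5-6: look up the full calculation results by the returned identifiers.
    rows = []
    for h in hits:
        key = (h["product_code"], h["merchant_code"])
        if key in result_store:
            rows.append(result_store[key])
    # Step 7: return the final result to the client.
    return rows


index = {"acme": [{"product_code": "P1", "merchant_code": "M1", "total": 5.0}]}
store = {("P1", "M1"): {"product_code": "P1", "title": 1.0, "total": 5.0}}
full_rows = handle_query({"keyword": " Acme "}, index, store)
totals_only = handle_query({"keyword": "acme"}, index, store, detail=False)
```

The keyword lookup never scans the result store; the store is only read by exact identifier, which is what keeps the second stage fast.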

The following is a test of the query efficiency of this application and the prior art:

Insert 1 million, 10 million, 100 million, and 1 billion records, each with 15 fields, into different tables of Elasticsearch and HBase. Then query with Java+MySQL and with Elasticsearch+HBase respectively.

The test results are recorded in Table 2.

Table 2. Comparison of query efficiency between Elasticsearch+HBase and Java+Mysql

Number of records in the table	Elasticsearch+HBase	Java+Mysql
1 million	125 ms	0.564 s
10 million	140 ms	2.543 s
100 million	162 ms	Timeout exception
1 billion	190 ms	Timeout exception

The test results show that queries based on the combination of Elasticsearch+HBase greatly improve query efficiency. Even as the number of records grows by multiples, the query efficiency remains excellent.

Embodiment 1

As mentioned earlier, the aforementioned databases and the calculation program Spark can be replaced by modules with similar functions, and the calculation result can be set to data other than the product content quality score according to user needs. Based on this, Embodiment 1 of the present application provides a data processing method, as shown in FIG. 5, including the following steps:

S51. Store the original product content data in clusters, databases and tables in the first relational database;

S52. Establish index data according to the original product content data and store it in an index database; the index data includes keyword fields and query dimension identification data corresponding to each keyword field;

S53: Invoke a calculation program to calculate the original product content data to obtain calculation result data, and store the calculation result data in association with the query dimension identification data in the first relational database.

Preferably,

The method also includes:

Receive user's query request;

Parse the query request to obtain keywords to be queried;

In addition, the method may further include:

In another preferred embodiment, the method further includes: receiving the original product content data and storing it in a second relational database in clusters, databases, and tables; specifically, clusters and tables are classified according to the product code;

Embodiment 2

Corresponding to the above method, this application also provides a data processing platform, which includes a data storage layer and a data calculation layer;

The data storage layer is used to store the original product content data in the first relational database by cluster, database and table, and to build index data according to the original product content data and store it in the index database; the index data includes keyword fields and query dimension identification data corresponding to each keyword field;

In a preferred embodiment, the data processing platform further includes a data application layer, which is used to receive a user's query request, parse it to obtain the keyword to be queried, query the index database to obtain the query dimension identification data corresponding to the keyword as a target identifier, and query the first relational database to obtain the calculation result data corresponding to the target identifier, so as to return the result data to the user.

In a preferred embodiment, the storage layer is further configured to store at least part of the calculation result data in association with the query dimension identification data in the index database.

In a preferred embodiment, the storage layer is further configured to receive the original product content data and store it in a second relational database in clusters, databases and tables, and to synchronize the original product content data in the second relational database to the first relational database.

Embodiment 3

Corresponding to the foregoing method and platform, Embodiment 3 of the present application provides a computer system, including:

One or more processors; and

The original product content data is stored in the first relational database in clusters, databases and tables;

The original product content data is calculated by a calculation program to obtain calculation result data, and the calculation result data is associated with the query dimension identification data and stored in the first relational database.

Among them, FIG. 6 exemplarily shows the architecture of the computer system, which may specifically include a processor 1510, a video display adapter 1511, a disk drive 1512, an input/output interface 1513, a network interface 1514, and a memory 1520. The processor 1510, the video display adapter 1511, the disk drive 1512, the input/output interface 1513, the network interface 1514, and the memory 1520 may be communicatively connected through the communication bus 1530.

Among them, the processor 1510 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an application specific integrated circuit (ASIC), or one or more integrated circuits, and is used to execute relevant programs to realize the technical solutions provided in this application.

The memory 1520 may be implemented in the form of ROM (Read Only Memory), RAM (Random Access Memory), a static storage device, a dynamic storage device, etc. The memory 1520 may store an operating system 1521 used to control the operation of the computer system 1500, and a basic input output system (BIOS) used to control low-level operations of the computer system 1500. In addition, a web browser 1523, a data storage management system 1524, and an icon font processing system 1525 can also be stored. The foregoing icon font processing system 1525 may be an application program that specifically implements the foregoing steps in the embodiment of the present application. In short, when the technical solution provided by the present application is implemented through software or firmware, the related program code is stored in the memory 1520, and is called and executed by the processor 1510.

The input/output interface 1513 is used to connect input/output modules to realize information input and output. The input/output module can be configured in the device as a component (not shown in the figure), or it can be connected to the device externally to provide corresponding functions. The input device may include a keyboard, a mouse, a touch screen, a microphone, various sensors, etc., and the output device may include a display, a speaker, a vibrator, and an indicator light.

The network interface 1514 is used to connect a communication module (not shown in the figure) to realize the communication interaction between the device and other devices. The communication module can realize communication through wired means (such as USB, network cable, etc.), or through wireless means (such as mobile network, WIFI, Bluetooth, etc.).

The bus 1530 includes a path to transmit information between various components of the device (for example, the processor 1510, the video display adapter 1511, the disk drive 1512, the input/output interface 1513, the network interface 1514, and the memory 1520).

In addition, the computer system 1500 can also obtain information about specific receiving conditions from the virtual resource object receiving condition information database 1541 for condition determination, and so on.

It should be noted that although the above device only shows the processor 1510, the video display adapter 1511, the disk drive 1512, the input/output interface 1513, the network interface 1514, the memory 1520, the bus 1530, etc., in the specific implementation process, the The equipment may also include other components necessary for normal operation. In addition, those skilled in the art can understand that the above-mentioned device may also include only the components necessary to implement the solution of the present application, and not necessarily include all the components shown in the figure.

From the description of the foregoing implementation manners, it can be known that those skilled in the art can clearly understand that this application can be implemented by means of software plus a necessary general hardware platform. Based on this understanding, the technical solution of this application essentially or the part that contributes to the existing technology can be embodied in the form of a software product, and the computer software product can be stored in a storage medium, such as ROM/RAM, magnetic disk , CD-ROM, etc., including a number of instructions to enable a computer device (which may be a personal computer, a cloud server, or a network device, etc.) to execute the methods described in the various embodiments or some parts of the embodiments of the present application.

The various embodiments in this specification are described in a progressive manner. The same or similar parts of the various embodiments can be referred to each other, and each embodiment focuses on its differences from the other embodiments. In particular, because the platform and system embodiments are basically similar to the method embodiment, their description is relatively brief, and for the relevant parts reference can be made to the description of the method embodiment. The platform and system embodiments described above are merely illustrative: the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they can be located in one place or distributed across multiple network units. Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments. Those of ordinary skill in the art can understand and implement them without creative work.

The data processing method, platform, and system provided by the application have been described in detail above. Specific examples are used herein to illustrate the principles and implementation of the application, and the descriptions of the above examples are intended only to help in understanding the method of the application and its core idea. Meanwhile, for those of ordinary skill in the art, there will be changes in the specific implementation and in the scope of application according to the idea of this application. In summary, the content of this specification should not be construed as a limitation on this application.

Claims

1. A data processing method, characterized in that the method includes:

The original product content data is stored in the first relational database in clusters, databases and tables;

Create index data according to the original product content data and store it in an index database; the index data includes keyword fields and query dimension identification data corresponding to each keyword field;

The original product content data is calculated by a calculation program to obtain calculation result data, and the calculation result data is associated with the query dimension identification data and stored in the first relational database.
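
As a non-limiting illustration (not part of the claims), the three steps of the method above can be sketched with in-memory dictionaries standing in for the first relational database and the index database. All function names, field names, and the placeholder computation (title length) are assumptions made for this sketch only.

```python
# In-memory stand-ins (assumption); a real deployment would use actual databases.
relational_db = {"raw": {}, "results": {}}   # plays the role of the first relational database
index_db = {}                                # keyword -> set of product codes


def store_raw(records):
    """Step 1: store original product content data, keyed by product code."""
    for rec in records:
        relational_db["raw"][rec["product_code"]] = rec


def build_index(records):
    """Step 2: map each keyword to the identifiers of the products it describes."""
    for rec in records:
        for keyword in rec["keywords"]:
            index_db.setdefault(keyword, set()).add(rec["product_code"])


def compute_and_store():
    """Step 3: compute a result per product and store it under the same identifier."""
    for code, rec in relational_db["raw"].items():
        # Placeholder computation (assumption): title length in characters.
        relational_db["results"][code] = {"title_length": len(rec["title"])}


records = [
    {"product_code": "P001", "title": "Red kettle", "keywords": ["kettle", "red"]},
    {"product_code": "P002", "title": "Blue kettle 2L", "keywords": ["kettle", "blue"]},
]
store_raw(records)
build_index(records)
compute_and_store()
```

Because the computed results and the index entries share the same identifier, a later lookup can go from keyword to identifier to result without scanning the raw data.
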
2. The data processing method according to claim 1, wherein the method further comprises:

Receive a user's query request;

Parse the query request to obtain keywords to be queried;

Query in the index database to obtain the query dimension identification data corresponding to the keyword to be queried as a target identifier;

Query in the first relational database to obtain calculation result data corresponding to the target identifier.
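
As a non-limiting illustration (not part of the claims), the two-stage lookup above can be sketched as follows. The dictionaries stand in for the index database and the first relational database. The whitespace tokenizer and the choice to intersect the identifier sets of several keywords are assumptions of this sketch, and all names are hypothetical.

```python
# Stand-in for the index database: keyword -> identifiers (assumed sample data).
index_db = {"kettle": {"P001", "P002"}, "red": {"P001"}}
# Stand-in for the first relational database: identifier -> calculation result data.
results_db = {"P001": {"total_score": 85}, "P002": {"total_score": 70}}


def parse_query(request):
    """Parse the request into keywords; here simply lower-cased whitespace tokens."""
    return request.lower().split()


def query(request):
    """Resolve keywords to target identifiers first, then fetch results by identifier."""
    target_ids = None
    for keyword in parse_query(request):
        ids = index_db.get(keyword, set())
        # Assumption: a product must match every keyword (set intersection).
        target_ids = ids if target_ids is None else target_ids & ids
    if not target_ids:
        return {}
    return {pid: results_db[pid] for pid in sorted(target_ids) if pid in results_db}
```

For example, `query("Red kettle")` consults only the index for the keywords and then reads a single row of result data by identifier.
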
3. The data processing method according to claim 1, wherein the method further comprises:

At least part of the calculation result data is associated with the query dimension identification data and stored in the index database.
4. The data processing method according to claim 3, wherein:

The calculating of the original product content data by the calculation program to obtain calculation result data includes:

Invoking the calculation program to calculate, from the original product content data, a content quality score for each product in each of at least two content dimensions, and to calculate a total content quality score for each product according to its content quality scores in the respective dimensions;

The storing of the calculation result data in association with the query dimension identification data in the first relational database includes:

Storing the content quality score of each dimension of each product and the total content quality score of each product in association with the identification data in the first relational database;

The storing of at least part of the calculation result data in association with the query dimension identification data in the index database includes:

Storing the total content quality score of each product in association with the query dimension identification data in the index database.
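
As a non-limiting illustration (not part of the claims), the scoring and the two storage targets can be sketched as follows. The two content dimensions (title and images), the scoring rules, the weights, and all names are assumptions chosen for this sketch; the application does not specify them.

```python
# Assumed weights for combining per-dimension scores into a total score.
WEIGHTS = {"title": 0.4, "images": 0.6}


def dimension_scores(product):
    """Score each content dimension on a 0-100 scale (hypothetical rules)."""
    title_score = min(len(product["title"]), 50) * 2    # longer titles score higher, capped at 50 chars
    image_score = min(product["image_count"], 5) * 20   # at most five images are counted
    return {"title": title_score, "images": image_score}


def total_score(scores):
    """Combine per-dimension scores into a total score as a weighted sum."""
    return sum(WEIGHTS[dim] * score for dim, score in scores.items())


product = {"product_code": "P001", "title": "Stainless steel kettle", "image_count": 3}
scores = dimension_scores(product)
total = total_score(scores)

# Row for the first relational database: per-dimension scores plus the total score.
relational_row = {"product_code": product["product_code"], **scores, "total": total}
# Document for the index database: only the total score, keyed by the same identifier.
index_doc = {"product_code": product["product_code"], "total": total}
```

Keeping only the total score in the index document lets common queries be answered from the index alone, while the per-dimension detail remains available in the relational store.
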
5. The data processing method according to any one of claims 1 to 4, wherein the identification data is a product code and/or a merchant code.
6. The data processing method according to any one of claims 1 to 4, wherein the method further comprises:

Receiving the original product content data and storing it in a second relational database in clusters, databases and tables;

Synchronize the original product content data in the second relational database to the first relational database.
7. The data processing method according to claim 6, wherein the receiving the original product content data and storing it in a second relational database in clusters, databases and tables comprises:

The original product content data is received and stored in the second relational database in clusters, databases and tables according to the product code.
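
As a non-limiting illustration (not part of the claims), routing a record to a cluster, database, and table according to its product code can be sketched as follows. The hash-and-modulo scheme and the shard counts are assumptions of this sketch; the application states only that the product code determines the placement.

```python
import zlib

# Assumed shard layout: 2 clusters, 4 databases per cluster, 8 tables per database.
CLUSTERS, DBS_PER_CLUSTER, TABLES_PER_DB = 2, 4, 8


def route(product_code):
    """Map a product code to a (cluster, database, table) triple.

    zlib.crc32 is used because it is stable across processes, unlike
    Python's built-in hash() for strings.
    """
    h = zlib.crc32(product_code.encode("utf-8"))
    cluster = h % CLUSTERS
    database = (h // CLUSTERS) % DBS_PER_CLUSTER
    table = (h // (CLUSTERS * DBS_PER_CLUSTER)) % TABLES_PER_DB
    return cluster, database, table
```

The same product code always routes to the same location, so writes and later reads for a product touch a single table rather than the whole data set.
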
8. The data processing method according to claim 6, wherein:

The first relational database is Hbase, the second relational database is Mysql, the calculation program is Spark, and the index database is Elasticsearch.
9. A data processing platform, characterized in that the platform includes a data storage layer and a data calculation layer;

The data storage layer is used to store the original product content data in clusters, databases and tables in a first relational database, and to build index data based on the original product content data and store it in an index database; the index data includes keyword fields and query dimension identification data corresponding to each keyword field;

The data calculation layer is used to call a calculation program to calculate the original product content data to obtain calculation result data and store the calculation result data in association with the query dimension identification data in the first relational database.
10. A computer system, characterized in that it comprises:

One or more processors; and

A memory associated with the one or more processors, where the memory is used to store program instructions which, when read and executed by the one or more processors, perform the following operations:

Store the original product content data in clusters, databases and tables in the first relational database;

Create index data according to the original product content data and store it in an index database; the index data includes keyword fields and identification data corresponding to each keyword field;

Call a calculation program to calculate the original product content data to obtain calculation result data, and store the calculation result data in association with the identification data in the first relational database.