WO2022083332A1

WO2022083332A1 - Commodity data management method and apparatus, and server

Info

Publication number: WO2022083332A1
Application number: PCT/CN2021/116999
Authority: WO
Inventors: 常亚; 王刚; 胡小清
Original assignee: 华为技术有限公司
Priority date: 2020-10-23
Filing date: 2021-09-07
Publication date: 2022-04-28
Also published as: CN114528343A

Abstract

The present application provides a commodity data management method and apparatus, and a server, which are applicable to the technical field of data management. Said method comprises: acquiring commodity data, and splitting the commodity data into at least one first sub-file, each first sub-file comprising attribute data of at least one commodity; and performing attribute data verification on each first sub-file, and storing the attribute data passing the verification into a database. The embodiments of the present application greatly reduce the technical threshold of commodity data management operations and have higher availability. Furthermore, automatic verification and data storage of the commodity data also greatly increases the management efficiency of commodity data.

Description

Commodity data management method, device and server

This application claims priority to Chinese patent application No. 202011152745.1, entitled "Commodity data management method, device and server" and filed with the State Intellectual Property Office on October 23, 2020, the entire contents of which are incorporated herein by reference.

Technical Field

The present application belongs to the technical field of data management, and in particular relates to a commodity data management method, device and server.

Background

E-commerce has become a mainstream way of shopping. In one common mode, a content provider (CP, which may be the merchant or a party other than the merchant) supplies commodity data to a commodity management system, and users search for commodities through that system. This mode exposes commodities quickly and in turn brings substantial traffic to merchants.

To let users search for commodities, existing commodity management systems require the CP to provide structured commodity data, which the system stores and uses to provide commodity search services. However, this approach requires the CP to have a certain data operation capability, so the operation threshold is relatively high. Moreover, most CPs have a large number of commodities, and preparing structured data for all of them takes considerable manpower and material resources, so the workload of providing commodity data is often heavy. In short, the prior art manages commodity data inefficiently and sets a relatively high operation threshold for the CP, which hinders effective management of commodity data.

Summary of the Invention

In view of this, the embodiments of the present application provide a commodity data management method, device, and server, which can solve the problem of low efficiency in commodity data management in the prior art.

A first aspect of the embodiments of the present application provides a commodity data management method, applied to a server, including:

The commodity data is acquired, and the commodity data is divided into at least one first sub-file, wherein each first sub-file contains attribute data of at least one commodity.

Perform attribute data verification on each of the first sub-files, and store the verified attribute data in the database.

In the data management process of the embodiments of the present application, the CP only needs to provide commodity data that meets the format requirements, and the commodity data can then be imported offline into the database. Because the CP already needs to organize its commodity data in practice (whether for inventory sorting or for listing on e-commerce platforms), it only has to arrange that data according to the format requirements, which adds little extra work. After receiving the commodity data, the commodity management system splits it into multiple sub-files (i.e., the first sub-files), performs attribute data verification on each sub-file, and stores the attribute data that passes verification in the database. The server may verify the sub-files serially or in parallel. In parallel processing, the server verifies multiple sub-files at the same time, which improves verification efficiency.
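The split-then-verify flow described above can be sketched as follows. This is a minimal illustration rather than the claimed implementation: the required fields, the function names, and the use of a thread pool for the parallel case are all assumptions made for the example, and a plain list stands in for the database.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical required attributes; the source does not list concrete fields.
REQUIRED_FIELDS = ("id", "name", "price")


def split_into_subfiles(rows, rows_per_subfile):
    """Split commodity rows into sub-files holding at least one commodity each."""
    if rows_per_subfile < 1:
        raise ValueError("rows_per_subfile must be at least 1")
    return [rows[i:i + rows_per_subfile]
            for i in range(0, len(rows), rows_per_subfile)]


def verify_subfile(subfile):
    """Return the rows of one sub-file whose required attributes are all present."""
    return [row for row in subfile
            if all(row.get(field) not in (None, "") for field in REQUIRED_FIELDS)]


def import_commodity_data(rows, rows_per_subfile=2, parallel=True):
    """Verify every sub-file serially or in parallel and collect the passing rows."""
    subfiles = split_into_subfiles(rows, rows_per_subfile)
    if parallel:
        with ThreadPoolExecutor() as pool:
            verified = list(pool.map(verify_subfile, subfiles))
    else:
        verified = [verify_subfile(subfile) for subfile in subfiles]
    database = []  # stand-in for the real database
    for passed in verified:
        database.extend(passed)
    return database
```

Both modes produce the same stored rows; only the scheduling of the per-sub-file verification differs.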

Compared with the prior art, the embodiment of the present application greatly reduces the technical threshold of CP operation, and has higher usability. At the same time, the automatic verification and data storage of commodity data also greatly improves the management efficiency of commodity data.

In a first possible implementation manner of the first aspect, attribute data verification is performed on each of the first sub-files, and the verified attribute data is stored in a database, including:

One sub-file is selected from the at least one first sub-file as the second sub-file.

Perform attribute data verification on the second sub-file, and upload the verified attribute data in the second sub-file to the database.

After the verification of the second sub-file is completed, return to the operation of selecting one sub-file from the at least one first sub-file as the second sub-file, until all the first sub-files are verified.

In the embodiment of the present application, the server cyclically selects each sub-file (i.e., the second sub-file) from the first sub-files for processing, so that every first sub-file is verified and the commodity data in it is stored in the database along the way. This makes storage more efficient and realizes efficient management of commodity data.

Based on the first possible implementation manner of the first aspect, as a second possible implementation manner of the first aspect, before attribute data verification is performed on the second sub-file and the verified attribute data in the second sub-file is uploaded to the database, the method further includes:

A first subtask corresponding to the at least one first subfile one-to-one is created in the database.

Perform attribute data verification on the second sub-file, and upload the verified attribute data in the second sub-file to the database, including:

A second subtask is determined from the first subtasks stored in the database, and a second subfile associated with the second subtask is acquired from at least one first subfile.

Returning to the operation of selecting one sub-file from the at least one first sub-file as the second sub-file until all the first sub-files are verified includes: returning to the operation of determining a second subtask from the first subtasks stored in the database, until all the first subtasks have been executed.

In this embodiment of the present application, sub-file selection is achieved by creating, in the database, subtasks (i.e., the first subtasks) in one-to-one correspondence with the sub-files and determining the subtask to be executed (i.e., the second subtask). Therefore, in the embodiment of the present application, the database can effectively record the verification status of each sub-file, and the server can conveniently determine which sub-file to verify each time.

Based on the second possible implementation manner of the first aspect, as a third possible implementation manner of the first aspect, determining a second subtask from the first subtasks stored in the database includes:

The subtasks to be executed among the first subtasks are acquired, where the subtasks to be executed include first subtasks that have not been executed, and first subtasks that are being executed and whose execution duration exceeds the duration threshold.

A second subtask is determined from the subtasks to be executed.

In this embodiment of the present application, whether a subtask is to be executed is determined from its execution state. On the one hand, unexecuted subtasks need to be processed by a server. On the other hand, a server may in practice be unable to process a subtask normally, for example because of downtime. In that case the subtask is marked as being executed but cannot be completed; however long one waits for that server, the subtask will not finish and the sub-file will not be verified, so the subtask needs to be reprocessed by another server. Based on these two considerations, this embodiment of the present application treats unexecuted subtasks, and subtasks that are being executed but whose execution duration has exceeded the duration threshold, as subtasks to be executed. On this basis, the server obtains all current subtasks to be executed and determines the second subtask from among them.
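The selection rule above can be sketched as a simple filter over subtask records. The `Subtask` structure, the state names, and the use of a start timestamp to measure execution duration are assumptions made for illustration; the source only specifies the two conditions.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical state names for a subtask record.
PENDING, RUNNING, DONE = "pending", "running", "done"


@dataclass
class Subtask:
    task_id: str
    state: str
    started_at: Optional[float] = None  # seconds; set when execution starts


def subtasks_to_execute(subtasks: List[Subtask], now: float,
                        duration_threshold: float) -> List[Subtask]:
    """Keep unexecuted subtasks and running subtasks that exceeded the threshold."""
    selected = []
    for task in subtasks:
        if task.state == PENDING:
            selected.append(task)
        elif (task.state == RUNNING and task.started_at is not None
              and now - task.started_at > duration_threshold):
            # Probably stuck on a failed server, so another server should retry it.
            selected.append(task)
    return selected
```

A running subtask that is still within the threshold is left alone, so a healthy server is not interrupted.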

On the basis of the third possible implementation manner of the first aspect, as the fourth possible implementation manner of the first aspect, a second subtask is determined from the subtasks to be executed, including:

Distributed locks for the to-be-executed subtasks are requested from the cache component one subtask at a time.

If the distributed lock for a to-be-executed subtask is obtained, that subtask is taken as the second subtask.

As an optional embodiment of the present application, multiple servers may process the subtasks at the same time to improve processing efficiency. In this case, the server that executes the solutions of the first aspect is one of the servers that process subtasks. In practice, multiple servers may select the same subtask for processing at the same time, which reduces the processing efficiency of the commodity data. To prevent a single subtask from being processed repeatedly by multiple servers at the same time, in this embodiment of the present application, after receiving the subtasks to be executed, the server first selects one of them and tries to apply to the cache component for the distributed lock of that subtask. The distributed lock of a single subtask can be assigned to only one server. Therefore, if the subtask is not being processed by another server, the distributed lock for the subtask can in theory be obtained. Conversely, if the subtask is already being processed by another server, then, because a distributed lock must be applied for before execution, the cache component has recorded that another server holds the lock, and the lock cannot be acquired. On this basis, after obtaining the distributed lock and completing the locking operation, the server determines that the subtask is the one to execute this time and downloads the corresponding sub-file. If acquiring the distributed lock fails, the server performs subtask selection again to choose another suitable subtask.
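The lock-before-execute rule can be illustrated with an in-process stand-in for the cache component. The class `InMemoryLockCache` and the function `pick_second_subtask` are hypothetical names; a real deployment would rely on a shared cache offering an atomic set-if-absent operation, which this sketch only imitates with a local mutex.

```python
import threading


class InMemoryLockCache:
    """Stand-in for the cache component: one lock owner per subtask key."""

    def __init__(self):
        self._owners = {}
        self._guard = threading.Lock()

    def try_acquire(self, key, owner):
        """Atomically record the owner if nobody holds the lock yet."""
        with self._guard:
            if key in self._owners:
                return False
            self._owners[key] = owner
            return True

    def release(self, key, owner):
        """Release the lock only if the caller is its current owner."""
        with self._guard:
            if self._owners.get(key) == owner:
                del self._owners[key]
                return True
            return False


def pick_second_subtask(cache, pending_task_ids, server_id):
    """Request locks one subtask at a time; the first one obtained wins."""
    for task_id in pending_task_ids:
        if cache.try_acquire(task_id, server_id):
            return task_id
    return None  # every candidate is already held by another server
```

Because a lock has a single owner, two servers iterating over the same candidate list end up with different subtasks.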

On the basis of the fourth possible implementation manner of the first aspect, as the fifth possible implementation manner of the first aspect, the process of performing attribute data verification on the second sub-file further includes:

It is judged whether the verification duration of the second sub-file reaches the duration threshold.

If the verification duration on the second subfile reaches the duration threshold, the distributed lock on the second subfile is released.

If a server fails, it may occupy a subtask for a long time, which reduces the execution efficiency of the subtask. To prevent this, the server counts how long it has been verifying the subtask and determines whether the duration threshold has been reached. If it has, the server's verification of the subtask has timed out, possibly because the server itself is faulty. The server therefore releases the distributed lock on the sub-file so that other servers can execute the subtask. This implements automatic node takeover of subtasks and greatly enhances the reliability of subtask execution.
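A minimal sketch of the self-timing behaviour follows. The function name, the per-row timeout check, and the `release_lock` callback are assumptions for illustration; what happens to the lock after a normal completion is outside this sketch.

```python
import time


def verify_subfile_with_timeout(rows, verify_row, release_lock,
                                duration_threshold, clock=time.monotonic):
    """Verify rows until done or until the duration threshold is reached.

    Returns (passed_rows, finished). On timeout the lock is released so that
    another server can take over the subtask.
    """
    started = clock()
    passed = []
    for row in rows:
        if clock() - started >= duration_threshold:
            release_lock()  # give up the subtask; this server may be faulty
            return passed, False
        if verify_row(row):
            passed.append(row)
    return passed, True
```

The `clock` parameter exists only so that the timing can be controlled in a test; by default the monotonic system clock is used.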

Based on the first to fifth possible implementation manners of the first aspect, as a sixth possible implementation manner of the first aspect, if the commodity data contains attribute data that fails verification, abnormal information about the attribute data that failed verification is acquired, and the abnormal information is stored in the database.

In this embodiment of the present application, the validity of the parsed attribute data is checked, that is, whether each item of attribute data has problems such as missing data or data errors. If such problems exist, the verification of that attribute data fails. In that case, the embodiment of the present application uploads the abnormal information corresponding to the data parsing abnormality to the database, so that the database records the abnormal information. The abnormal information can then be fed back to the CP, or the CP can query it. The CP can thus quickly learn which data has problems and perform targeted checks and re-import, which improves the efficiency of storing the commodity data in the database.
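The validity check and the recording of abnormal information can be sketched as below. The field names, the specific rules (presence of required fields, a numeric non-negative price), and the record layout are hypothetical; the source only states that missing data and data errors cause verification to fail.

```python
def verify_commodity(row, required_fields=("id", "name", "price")):
    """Return a list of problems found in one commodity's attribute data."""
    problems = []
    for field in required_fields:
        if row.get(field) in (None, ""):
            problems.append(f"missing {field}")
    price = row.get("price")
    if price not in (None, ""):
        try:
            if float(price) < 0:
                problems.append("price is negative")
        except (TypeError, ValueError):
            problems.append("price is not a number")
    return problems


def check_subfile(rows):
    """Split rows into verified data and abnormal information records."""
    passed, abnormal = [], []
    for index, row in enumerate(rows):
        problems = verify_commodity(row)
        if problems:
            # The row index lets the CP locate the faulty entry later.
            abnormal.append({"row": index, "problems": problems})
        else:
            passed.append(row)
    return passed, abnormal
```

Both lists would be written to the database, so that the CP can look up exactly which rows need to be corrected and imported again.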

Based on the first to sixth possible implementation manners of the first aspect, as a seventh possible implementation manner of the first aspect, the commodity data is data in a data table format.

In the embodiment of the present application, the format of the commodity data is set as a data table format. Since the data table format is a common data recording format, it is the format that many CPs use when they organize commodity data on a daily basis. Therefore, for the CP, if the commodity data is required to be provided in a data table format, the CP only needs to simply organize the original commodity data to obtain the commodity data required by the commodity management system. As a result, the technical threshold and workload requirements for CP are greatly reduced, thereby improving the efficiency of commodity data management.

Based on the first to seventh possible implementation manners of the first aspect, as an eighth possible implementation manner of the first aspect, uploading the attribute data in the second sub-file to the database includes:

In the process of performing attribute data verification on the second sub-file, the attribute data that has passed the verification in the second sub-file is uploaded to the database. Alternatively, after the attribute data verification of the second sub-file is completed, the attribute data that has passed the verification in the second sub-file is uploaded to the database.

In the embodiment of the present application, two schemes for checking and storing commodity attribute data are provided:

Scheme 1: While a single sub-file is being verified, the commodity attribute data is stored in the database, with the attribute data of a single commodity as the unit of each verification and storage operation (corresponding to the embodiment shown in FIG. 2A).

Scheme 2: The commodity attribute data is stored in the database only after all the attribute data of a single sub-file has been verified (corresponding to the embodiment shown in FIG. 3A).

The two schemes differ in their beneficial effects as follows:

The operation granularity of Scheme 1 is a single commodity, while the operation granularity of Scheme 2 is a single sub-file.

In Scheme 1, the server needs to interact with the database multiple times while verifying a single sub-file. This consumes more network resources and places higher requirements on the quality of the network connection between the server and the database.

In Scheme 1, the attribute data or abnormal information of each commodity in the sub-file is stored in the database as the sub-file is verified. If the server becomes abnormal, the database has already recorded all the commodity attribute data in the current sub-file that was verified before the abnormality. On this basis, when another server re-verifies the sub-file, it can either start verification from the beginning or continue with the commodity attribute data in the sub-file that has not yet been stored. The fault tolerance mechanism of Scheme 1 is therefore more complete, and its fault tolerance is high.

In Scheme 2, if the server cannot continue verifying the current sub-file because of an abnormal condition, the database has not received any attribute data of the current sub-file, so another server needs to re-verify the sub-file completely. In summary, when the server is abnormal, Scheme 1 can in theory reduce the probability of repeated verification of sub-files compared with Scheme 2, which reduces the verification workload and handles server abnormalities effectively.
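The trade-off between the two schemes can be made concrete with a small simulation. `FakeDatabase`, `is_valid`, and the `crash_after` parameter are hypothetical helpers introduced only to count database interactions and to simulate a server failure partway through a sub-file.

```python
class FakeDatabase:
    """Stand-in database that counts how many write interactions occur."""

    def __init__(self):
        self.rows = []
        self.writes = 0

    def store(self, rows):
        self.rows.extend(rows)
        self.writes += 1


def is_valid(row):
    # Hypothetical rule: a commodity needs a non-empty name.
    return bool(row.get("name"))


def scheme_1(subfile, db, crash_after=None):
    """Verify and store one commodity at a time."""
    for index, row in enumerate(subfile):
        if crash_after is not None and index == crash_after:
            raise RuntimeError("simulated server failure")
        if is_valid(row):
            db.store([row])


def scheme_2(subfile, db, crash_after=None):
    """Verify the whole sub-file first, then store everything in one write."""
    passed = []
    for index, row in enumerate(subfile):
        if crash_after is not None and index == crash_after:
            raise RuntimeError("simulated server failure")
        if is_valid(row):
            passed.append(row)
    db.store(passed)
```

With three valid commodities, the first function performs three writes and the second performs one. If a failure is simulated before the third commodity, the first function has already stored two commodities while the second has stored none.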

Based on the first to eighth possible implementation manners of the first aspect, as a ninth possible implementation manner of the first aspect, the attribute data in the commodity data includes a commodity image download address, and the method further includes:

Download the product image according to the product image download address included in the verified attribute data.

Perform image feature analysis on commodity pictures to obtain image feature data.

Store image feature data in a feature library.

In this embodiment of the present application, image feature analysis is performed on the commodities stored in the database, and the resulting image feature data is stored in the feature library for use in subsequent commodity searches by users. The embodiments of the present application can therefore provide data support for subsequent commodity searches.

As an embodiment of the present application, image feature analysis is performed on a commodity picture to obtain image feature data, including:

Receive the first commodity picture uploaded by the user terminal.

Perform image feature analysis on the first commodity picture to obtain first image feature data.

Perform image feature analysis on the product image to obtain image feature data.

In practice, when a CP photographs a commodity, objects other than the commodity are very likely to be captured as well, so the product image may contain multiple objects. If feature analysis is performed directly on the product image, the resulting image feature data also covers the other objects, which hinders subsequent image matching. Therefore, in this embodiment of the present application, commodity detection is performed on the commodity picture before image feature analysis. The commodity image is then cropped according to the detection result and analyzed, so that the extracted feature data is more consistent with the commodity itself and is more accurate and reliable. This further improves the accuracy and reliability of subsequent product searches.

A second aspect of the embodiments of the present application provides a commodity search method, which is applied to a server, and the method includes:

Obtain first image feature data of a first commodity picture, where the first commodity picture is a picture uploaded by a user terminal.

From the image feature data stored in the feature library, at least one second image feature data with the highest feature matching degree with the first image feature data is determined.

Send, to the user terminal, second product images in one-to-one correspondence with the at least one second image feature data, together with the attribute data associated with the second product images, wherein the second product images and associated attribute data that are sent have been sorted based on the trademark information contained in the first product image.

In the embodiment of the present application, feature matching on the commodity pictures uploaded by the user enables an accurate and fast search of the commodities already stored in the database. In addition, the retrieved product data is reordered according to the trademark information in the product image before being fed back to the user, so that the attribute data and product images of products highly similar to the product the user is searching for are displayed first on the user terminal. This improves the accuracy and relevance of the search results.
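The feature-matching step can be sketched as a top-k search over the feature library. Cosine similarity is used here as the matching degree purely as an assumption, since the source does not name a metric; the function names and the dictionary layout of the feature library are likewise hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_matches(first_feature, feature_library, k):
    """Return the ids of the k library entries that best match the query."""
    scored = [(cosine_similarity(first_feature, vector), commodity_id)
              for commodity_id, vector in feature_library.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [commodity_id for _, commodity_id in scored[:k]]
```

A linear scan like this is only practical for a small library; it is meant to show the selection rule, not a production search index.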

On the basis of the first possible implementation manner of the second aspect, as a second possible implementation manner of the second aspect, before the second product images in one-to-one correspondence with the at least one second image feature data and the attribute data associated with the second product images are sent to the user terminal, the method further includes:

Obtain the first trademark information contained in the first product image.

Target trademark information of each target product is acquired, where the target product is the product associated with the second image feature data, and the second product image and associated attribute data are the product image and attribute data of the target product.

Sort the second product pictures and attribute data of the target products in descending order of the information matching degree between the target trademark information and the first trademark information.

In this embodiment of the present application, after matching multiple target commodities with a high degree of correlation, the trademark information in the commodity pictures uploaded by the user is matched with the trademark information of each target commodity. Then sort them in order of matching degree. Thereby, the reordering of target commodities based on trademark information is realized.
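The reordering by trademark can be sketched as a stable sort on a matching score. The string-similarity ratio from Python's `difflib` is used as the information matching degree only as an assumption, and the `trademark` key on each target record is a hypothetical field name.

```python
from difflib import SequenceMatcher


def trademark_match(first_trademark, target_trademark):
    """Similarity in [0, 1] between two trademark strings; 0.0 if either is missing."""
    if not first_trademark or not target_trademark:
        return 0.0
    return SequenceMatcher(None, first_trademark.lower(),
                           target_trademark.lower()).ratio()


def rerank_by_trademark(first_trademark, targets):
    """Sort target commodities by descending trademark matching degree.

    The sort is stable, so targets with equal scores keep the order produced
    by the earlier feature-matching step.
    """
    return sorted(
        targets,
        key=lambda target: trademark_match(first_trademark, target.get("trademark")),
        reverse=True,
    )
```

Targets without any trademark information score zero and therefore fall to the end of the list rather than being dropped.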

Based on the second possible implementation manner of the second aspect, as a third possible implementation manner of the second aspect, the target trademark information includes: the second trademark information and/or the third trademark information.

The second trademark information is the trademark information contained in the second product image associated with the target product.

The third trademark information is the trademark information contained in the attribute data associated with the target product.

In this embodiment of the present application, the trademark information of the target commodity may be the trademark information contained in the commodity picture thereof, or may be the trademark information recorded in the attribute data thereof. It is also possible to include both. Therefore, the embodiment of the present application can adapt to various actual situations to obtain the trademark information of the target commodity, so as to ensure the reliability of the matching of the trademark information. In addition, when both are included at the same time, the probability of obtaining the trademark information of the target product can be improved.

On the basis of the first to third possible implementations of the second aspect, as the fourth possible implementation of the second aspect, image feature analysis is performed on the first commodity picture to obtain the first image feature data, including:

Perform image feature analysis on the first commodity picture by using an image feature analysis model trained in advance to obtain first image feature data. The image feature analysis model is a model extracted from a neural network model trained on commodity image samples and attribute data samples of multiple commodity samples.

In the embodiment of the present application, image feature analysis uses an image feature analysis model trained on data in two dimensions, commodity pictures and attribute data, which improves the accuracy of feature analysis and the reliability of subsequent feature matching.

As an embodiment of training the image feature analysis model, it includes:

Preset an initial model.

Obtain product pictures and corresponding product information of multiple sample products, use these product pictures and product information as sample data, and add a classification label corresponding to the sample product to each product picture and each product information.

Feature extraction is performed on commodity information as sample data by using the initial model, and a first loss function for commodity information is calculated according to the extracted text features and corresponding classification labels.

The initial model is used to extract the image features of the product images as sample data, and the second loss function for the product images is calculated according to the image features and the corresponding classification labels.

A third loss function is calculated based on the first loss function and the second loss function, and the initial model is iteratively updated according to the calculated value of the third loss function until a preset convergence condition is satisfied, and a trained model is obtained.

From the trained model, each network used for feature extraction of commodity images is extracted, and an image feature analysis model composed of these extracted networks is obtained.

In the embodiment of the present application, the commodity pictures and commodity information of the sample commodities are processed separately using the training method of a classification model. After the loss functions for the two dimensions are obtained, multi-modal fusion training is performed: the loss values of the two dimensions are fused through a new loss function, and the model is iteratively updated based on the fused loss value. Finally, the networks used for feature extraction of product images are extracted from the trained model (i.e., the networks that extract features from product information are discarded) to form a new model for image feature analysis. Practice has shown that an image feature analysis model trained in this way extracts product image features more accurately and reliably, and that the resulting image feature data characterizes product images better. Image feature data extracted with this model achieves high accuracy in product image matching.
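The fusion of the two losses can be illustrated numerically. The source does not state how the third loss function is computed from the first two, so the sketch below assumes a cross-entropy classification loss for each branch and a weighted sum for the fusion; the function names and the `text_weight` parameter are hypothetical.

```python
import math


def cross_entropy(logits, label):
    """Cross-entropy of softmax(logits) against an integer class label."""
    highest = max(logits)  # subtract the maximum for numerical stability
    log_sum = highest + math.log(sum(math.exp(z - highest) for z in logits))
    return log_sum - logits[label]


def fused_loss(text_logits, image_logits, label, text_weight=0.5):
    """Assumed fusion: weighted sum of the text loss and the image loss."""
    first_loss = cross_entropy(text_logits, label)    # commodity information branch
    second_loss = cross_entropy(image_logits, label)  # commodity picture branch
    return text_weight * first_loss + (1.0 - text_weight) * second_loss
```

In a training loop, the fused value would drive the parameter update of both branches, after which only the image branch would be kept for feature analysis.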

A third aspect of the embodiments of the present application provides a commodity data management system, including: a first server, a second server, and a database.

The first server is used for acquiring commodity data, and dividing the commodity data into at least one first sub-file, wherein each first sub-file contains attribute data of at least one commodity.

The second server is configured to perform attribute data verification on each of the first sub-files, and store the verified attribute data in the database.

In the data management process of the embodiments of the present application, the CP only needs to provide commodity data that meets the format requirements, and the commodity data can then be imported offline into the database. Because the CP already needs to organize its commodity data in practice (whether for inventory sorting or for listing on e-commerce platforms), it only has to arrange that data according to the format requirements, which adds little extra work. After receiving the commodity data, the commodity management system splits it into multiple sub-files (i.e., the first sub-files), performs attribute data verification on each sub-file, and stores the attribute data that passes verification in the database. The second server may verify the sub-files serially or in parallel. In parallel processing, the server verifies multiple sub-files at the same time, which improves verification efficiency.

Compared with the prior art, the embodiment of the present application greatly reduces the technical threshold of CP operation, and has higher usability. At the same time, the automatic verification and data storage of commodity data also greatly improves the management efficiency of commodity data. Corresponding to the embodiment shown in FIG. 2A , in the embodiment of the present application, the first server refers to the execution subject server in S102-S1032. The second server is the execution subject server in S104-S109.

In a first possible implementation manner of the third aspect, attribute data verification is performed on each of the first sub-files, and the verified attribute data is stored in a database, specifically including:

The second server selects one sub-file from the at least one first sub-file as the second sub-file.

The second server performs attribute data verification on the second sub-file, and uploads the verified attribute data in the second sub-file to the database.

After completing the verification of the second sub-file, the second server returns to perform the operation of acquiring one sub-file in the at least one first sub-file until all the first sub-files are verified.

In this embodiment of the present application, the second server cyclically selects each sub-file (i.e., the second sub-file) from these sub-files for processing, so that every sub-file is verified and the commodity data in it is stored in the database along the way. This realizes automatic verification and storage of commodity data and greatly improves the management efficiency of commodity data.

In addition, the second server may be a specific server or any server in a server cluster that includes multiple servers. When the second server is any server in the server cluster, multiple servers can process and verify sub-files at the same time. Compared with a single server, this can greatly improve the verification speed and reliability for sub-files, and thus the efficiency of storing the commodity data in the database.

Based on the first possible implementation manner of the third aspect, in the second possible implementation manner of the third aspect, before selecting a subfile from the at least one first subfile as the second subfile, further include:

The first server is further configured to create a first subtask corresponding to the first subfile one-to-one in the database.

The second server selects one subfile from the at least one first subfile as the second subfile, including:

The second server determines a second subtask from the first subtasks stored in the database, and acquires a subfile associated with the second subtask in at least one first subfile.

The second server returns to perform the operation of acquiring one sub-file in the at least one first sub-file until all the first sub-files are verified, including:

The second server returns to perform the operation of determining a second subtask from the first subtasks stored in the database until all the first subtasks are executed and completed.

In this embodiment of the present application, the first server creates subtasks (i.e., the first subtasks) in the database in one-to-one correspondence with the subfiles, and each subfile is selected by determining the subtask to be executed (i.e., the second subtask). The database can therefore effectively record the verification status of each subfile, and the second server can conveniently determine which subfile to verify each time. When the second server is any server in a server cluster, creating subtasks in the database greatly facilitates the acquisition and verification of subfiles by each server in the cluster, which further improves the processing efficiency of the subfiles.
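As a minimal sketch of recording one subtask per subfile in a database, the following uses an in-memory SQLite database. The table name `subtask`, its columns, and the status strings are hypothetical; the application does not specify a schema.

```python
import sqlite3
from typing import Iterable


def create_subtasks(conn: sqlite3.Connection, sub_file_paths: Iterable[str]) -> None:
    """Create one subtask record per subfile (one-to-one via the UNIQUE constraint)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS subtask ("
        " id INTEGER PRIMARY KEY,"
        " sub_file TEXT UNIQUE NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'unexecuted')"
    )
    conn.executemany(
        "INSERT INTO subtask (sub_file) VALUES (?)",
        [(path,) for path in sub_file_paths],
    )
    conn.commit()


def mark_completed(conn: sqlite3.Connection, subtask_id: int) -> None:
    """Record that the subfile behind this subtask has been verified."""
    conn.execute("UPDATE subtask SET status = 'completed' WHERE id = ?", (subtask_id,))
    conn.commit()


conn = sqlite3.connect(":memory:")
create_subtasks(conn, ["part-0.csv", "part-1.csv"])  # hypothetical subfile names
mark_completed(conn, 1)
statuses = conn.execute("SELECT sub_file, status FROM subtask ORDER BY id").fetchall()
```

Because the status lives in the database rather than in any one server's memory, every server in a cluster sees the same record of which subfiles remain.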

On the basis of the second possible implementation manner of the third aspect, as the third possible implementation manner of the third aspect, the operation of determining a second subtask from the first subtasks stored in the database includes:

The second server sends a task query request to the database.

In response to the received task query request, the database selects the subtasks to be executed from the first subtasks, and sends the subtasks to be executed to the second server, where the subtasks to be executed include the unexecuted first subtasks, and the first subtasks that are being executed and whose execution duration exceeds the duration threshold.

The second server determines the second subtask from the received subtasks to be executed.

In this embodiment of the present application, whether a subtask is a subtask to be executed is determined based on its execution state. On the one hand, unexecuted subtasks need to be processed by a server. On the other hand, in practical applications a server may be unable to process a subtask normally, for example because of downtime. In this case, although the subtask is marked as being executed, it cannot be completed: however long one waits for that server, the processing of the subtask will not finish and the subfile will not be verified. Such subtasks therefore need to be reprocessed by other servers. Based on these two considerations, in this embodiment of the present application, unexecuted subtasks and subtasks that are being executed but whose execution duration exceeds the duration threshold are both regarded as subtasks to be executed. On this basis, the server obtains all current subtasks to be executed and determines the second subtask from them.
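The selection rule above (unexecuted subtasks, plus executing subtasks whose duration exceeds the threshold) can be sketched as a simple filter. The `Subtask` fields, the status strings, and the use of seconds as the time unit are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Subtask:
    task_id: int
    status: str  # assumed values: "unexecuted", "executing", "completed"
    started_at: Optional[float] = None  # assumed: start time in seconds


def subtasks_to_execute(
    subtasks: List[Subtask], now: float, duration_threshold: float
) -> List[Subtask]:
    """Return unexecuted subtasks and executing subtasks that have run too long."""
    pending = []
    for task in subtasks:
        if task.status == "unexecuted":
            pending.append(task)
        elif (
            task.status == "executing"
            and task.started_at is not None
            and now - task.started_at > duration_threshold
        ):
            pending.append(task)  # likely stuck on a failed server
    return pending


tasks = [
    Subtask(1, "unexecuted"),
    Subtask(2, "executing", started_at=100.0),  # running for 900 s
    Subtask(3, "executing", started_at=990.0),  # running for 10 s
    Subtask(4, "completed", started_at=50.0),
]
pending_ids = [
    t.task_id for t in subtasks_to_execute(tasks, now=1000.0, duration_threshold=600.0)
]
```

With a threshold of 600 seconds, subtask 2 is treated as overdue and returned alongside the unexecuted subtask 1, while subtask 3 is left to its current server.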

Based on the third possible implementation manner of the third aspect, as a fourth possible implementation manner of the third aspect, the operation in which the second server determines the second subtask from the received subtasks to be executed includes:

The second server sequentially requests the cache component for distributed locks for each subtask to be executed.

When the second server obtains the distributed lock for a single subtask to be executed, that subtask is regarded as the second subtask.

As an optional embodiment of the present application, multiple servers may process subtasks at the same time in order to improve processing efficiency. In this case, the server that is the execution body of each solution in the first aspect is also a server that processes subtasks. In practical applications, multiple servers may select the same subtask for processing at the same time, which reduces the processing efficiency of the commodity data. To prevent a single subtask from being processed repeatedly by multiple servers at the same time, in this embodiment of the present application, after receiving the subtasks to be executed, the server first selects one of them and tries to apply to the cache component for the distributed lock of that subtask. The distributed lock of a single subtask can only be assigned to a single server. Therefore, if the subtask is not being processed by another server, the distributed lock for the subtask can in theory be obtained. Conversely, if the subtask is already being processed by another server, then, because a distributed lock must be applied for before execution, the cache component will have recorded that the lock for the subtask is held by that server, and the lock cannot be acquired. Based on this principle, after obtaining the distributed lock and completing the locking operation, the server determines that the subtask is the one to be executed this time and downloads the corresponding subfile. If acquisition of the distributed lock fails, the server re-executes the subtask selection operation to select another suitable subtask.
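The "apply for the lock, then execute" rule can be sketched with an in-process stand-in for the cache component. `InMemoryLockTable` and `pick_second_subtask` are hypothetical names; a real deployment would use a distributed lock service rather than a local dictionary.

```python
import threading
from typing import Dict, Iterable, Optional


class InMemoryLockTable:
    """Stand-in for the cache component: at most one owner per subtask lock."""

    def __init__(self) -> None:
        self._owners: Dict[int, str] = {}
        self._guard = threading.Lock()  # makes check-and-set atomic

    def try_acquire(self, subtask_id: int, server_id: str) -> bool:
        with self._guard:
            if subtask_id in self._owners:
                return False  # another server already holds this lock
            self._owners[subtask_id] = server_id
            return True


def pick_second_subtask(
    lock_table: InMemoryLockTable, server_id: str, pending_ids: Iterable[int]
) -> Optional[int]:
    """Request locks in order; the first subtask whose lock is granted is selected."""
    for subtask_id in pending_ids:
        if lock_table.try_acquire(subtask_id, server_id):
            return subtask_id
    return None  # every pending subtask is locked by some other server


locks = InMemoryLockTable()
first = pick_second_subtask(locks, "server-a", [1, 2, 3])
second = pick_second_subtask(locks, "server-b", [1, 2, 3])
```

Both servers receive the same list of pending subtasks, yet they end up with different subtasks because the lock for subtask 1 is granted only once.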

Based on the fourth possible implementation manner of the third aspect, as the fifth possible implementation manner of the third aspect, in the process of performing attribute data verification on the second sub-file, the second server is further configured to:

Based on the fourth possible implementation manner of the third aspect, as the sixth possible implementation manner of the third aspect, the commodity data management system further includes: a cache component.

The cache component is configured to start timing after allocating the distributed lock on the second sub-file to the second server.

The cache component is further configured to release the distributed lock on the second subfile when the timing duration reaches the duration threshold.

To prevent a failed server from occupying a subtask for a long time and thereby reducing subtask execution efficiency, the cache component also times each distributed lock when allocating it and determines whether the duration threshold has been reached. If it has, the server has timed out while verifying the subfile and may be faulty. The distributed lock on the subfile is therefore released so that other servers can execute the subtask. This realizes automatic takeover of subtasks between nodes and greatly enhances the reliability of subtask execution.
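A minimal sketch of the timed lock follows. The class `ExpiringLockTable` is hypothetical, and the current time is passed in explicitly so that the behavior is deterministic; an actual cache component would rely on its own clock or on key expiry.

```python
from typing import Dict, Optional, Tuple


class ExpiringLockTable:
    """Stand-in for a cache component that releases locks held past a threshold."""

    def __init__(self, duration_threshold: float) -> None:
        self.duration_threshold = duration_threshold
        self._locks: Dict[int, Tuple[str, float]] = {}  # subtask -> (owner, start)

    def try_acquire(self, subtask_id: int, server_id: str, now: float) -> bool:
        held = self._locks.get(subtask_id)
        if held is not None:
            _, acquired_at = held
            if now - acquired_at < self.duration_threshold:
                return False  # lock is still within its allowed duration
            # Timer reached the threshold: the old lock is treated as released.
        self._locks[subtask_id] = (server_id, now)  # timing starts at allocation
        return True

    def owner(self, subtask_id: int) -> Optional[str]:
        held = self._locks.get(subtask_id)
        return held[0] if held else None


table = ExpiringLockTable(duration_threshold=600.0)
granted_a = table.try_acquire(7, "server-a", now=0.0)
denied_b = table.try_acquire(7, "server-b", now=300.0)
takeover_b = table.try_acquire(7, "server-b", now=600.0)
```

In this sketch server-b is refused while server-a's lock is fresh, and takes over the subtask once the timer reaches the threshold.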

Based on the first to sixth possible implementations of the third aspect, as a seventh possible implementation of the third aspect, if there is attribute data in the commodity data that fails verification, the second server obtains exception information for the attribute data that failed verification, and stores the exception information in the database.

Based on the first to seventh possible implementation manners of the third aspect, as an eighth possible implementation manner of the third aspect, the commodity data is data in a data table format.

Based on the first to eighth possible implementation manners of the third aspect, as a ninth possible implementation manner of the third aspect, performing attribute data verification on the second sub-file and uploading the verified attribute data in the second sub-file to the database includes:

In the process of verifying the attribute data of the second sub-file, the second server uploads the verified attribute data in the second sub-file to the database. Alternatively, after the attribute data verification of the second sub-file is completed, the second server uploads the attribute data that has passed the verification in the second sub-file to the database.

For the beneficial effects of the embodiments of the present application, reference may be made to the description of the beneficial effects in the eighth possible implementation manner of the first aspect, which will not be repeated here.

On the basis of the first to eighth possible implementation manners of the third aspect, as the ninth possible implementation manner of the third aspect, the attribute data in the commodity data includes the download address of the commodity image, and the method further includes:

The second server downloads the image of the product according to the download address of the image of the product contained in the attribute data that has passed the verification.

The second server performs image feature analysis on the product image to obtain image feature data.

The second server stores the image feature data in the feature library.

A fourth aspect of the embodiments of the present application provides a commodity data management device, including:

The commodity data acquisition module is used for acquiring commodity data, and dividing the commodity data into at least one first sub-file, wherein each first sub-file contains attribute data of at least one commodity.

The storage module is used to perform attribute data verification on each first sub-file, and store the verified attribute data in the database.

In a first possible implementation manner of the fourth aspect, the storage module includes:

The file selection module is used for selecting a sub-file from the at least one first sub-file as the second sub-file.

The data verification module is used for performing attribute data verification on the second sub-file, and uploading the verified attribute data in the second sub-file to the database.

After the verification of the second sub-file is completed, the device returns to the operation of selecting one sub-file from the at least one first sub-file as the second sub-file, until all the first sub-files have been verified.

The loop module is used to obtain a second sub-file, where the second sub-file is a sub-file selected from at least one first sub-file.

A fifth aspect of the embodiments of the present application provides a commodity search device, including:

The picture receiving module is used for receiving the first commodity picture uploaded by the user terminal.

The image analysis module is configured to perform image feature analysis on the first commodity picture to obtain first image feature data.

The feature matching module is configured to determine at least one second image feature data with the highest feature matching degree with the first image feature data from the image feature data stored in the feature library.

A commodity search module, configured to send, to the user terminal, second commodity pictures in one-to-one correspondence with the at least one second image feature data, together with the attribute data associated with the second commodity pictures, wherein the sent second commodity pictures and associated attribute data are sorted based on the trademark information contained in the first commodity picture.
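The feature matching step can be illustrated with a small sketch. The application does not state how the feature matching degree is computed; cosine similarity is used here purely as an assumption, and the feature vectors and identifiers are invented for illustration.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed matching degree: cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_matches(
    query: Sequence[float], feature_library: Dict[str, List[float]], k: int
) -> List[str]:
    """Return the IDs of the k stored features that best match the query."""
    ranked = sorted(
        feature_library.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [picture_id for picture_id, _ in ranked[:k]]


# Hypothetical feature library keyed by commodity picture ID.
feature_library = {"p1": [1.0, 0.0], "p2": [0.0, 1.0], "p3": [0.9, 0.1]}
best = top_matches([1.0, 0.0], feature_library, k=2)
```

The subsequent sorting by trademark information described for the commodity search module is not covered by this sketch.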

A sixth aspect of the embodiments of the present application provides a server. The server includes a memory and a processor, and the memory stores a computer program that can run on the processor. When the processor executes the computer program, the server is caused to implement the steps of the commodity data management method according to any one of the above-mentioned first aspects, or the steps of the commodity search method according to any one of the above-mentioned second aspects.

A seventh aspect of the embodiments of the present application provides a computer-readable storage medium storing a computer program. When the computer program is executed by a processor, the server is caused to implement the steps of the commodity data management method according to any one of the foregoing first aspects, or the steps of the commodity search method according to any one of the above-mentioned second aspects.

An eighth aspect of the embodiments of the present application provides a computer program product, which, when the computer program product runs on a server, causes the server to execute the commodity data management method according to any one of the first aspects above. Alternatively, the server is made to implement the steps of the method for searching for goods according to any one of the above-mentioned second aspects.

A ninth aspect of the embodiments of the present application provides a chip system. The chip system includes a processor coupled to a memory, and the processor executes a computer program stored in the memory so as to implement the commodity data management method according to any one of the foregoing first aspects, or the steps of the commodity search method according to any one of the above-mentioned second aspects.

The chip system may be a single chip or a chip module composed of multiple chips.

It can be understood that, for the beneficial effects of the fourth aspect to the ninth aspect, reference may be made to the relevant descriptions in the first aspect or the second aspect, which will not be repeated here.

Description of drawings

FIG. 1A is a schematic diagram of a commodity search interface provided by an embodiment of the present application;

FIG. 1B is a schematic diagram of a commodity data upload interface provided by an embodiment of the present application;

FIG. 2A is a system interaction diagram of a commodity management system provided by an embodiment of the present application;

FIG. 2B is a schematic diagram of an application scenario provided by an embodiment of the present application;

FIG. 2C is a schematic flowchart of applying for a subtask distributed lock in the commodity data management method provided by the embodiment of the present application;

FIG. 2D is a schematic flowchart of sub-file verification in the commodity data management method provided by the embodiment of the present application;

FIG. 2E is a schematic diagram of an application scenario provided by an embodiment of the present application;

FIG. 3A is a system interaction diagram of a commodity management system provided by an embodiment of the present application;

FIG. 3B is a commodity management service architecture diagram of a commodity management system provided by an embodiment of the present application;

FIG. 3C is an interaction diagram of a commodity management service scenario of a commodity management system provided by an embodiment of the present application;

FIG. 4A is a schematic flowchart of commodity detection in the commodity data management method provided by the embodiment of the present application;

FIG. 4B is a schematic flowchart of image search in the commodity data management method provided by the embodiment of the present application;

FIG. 4C is a schematic diagram of a logical architecture of commodity search provided by an embodiment of the present application;

FIG. 4D is a schematic diagram of a logical architecture of commodity search provided by an embodiment of the present application;

FIG. 5A is a system interaction diagram when performing text search in a commodity management system provided by an embodiment of the present application;

FIG. 5B is a schematic diagram of an application scenario provided by an embodiment of the present application;

FIG. 5C is a schematic diagram of an application scenario provided by an embodiment of the present application;

FIG. 5D is a schematic diagram of an application scenario provided by an embodiment of the present application;

FIG. 5E is a schematic diagram of an application scenario provided by an embodiment of the present application;

FIG. 6 is a system interaction diagram during image search in a commodity management system provided by an embodiment of the present application.

Detailed description

In the following description, for the purpose of illustration rather than limitation, specific details such as a specific system structure and technology are set forth in order to provide a thorough understanding of the embodiments of the present application. However, it will be apparent to those skilled in the art that the present application may be practiced in other embodiments without these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.

For ease of understanding, here is a brief description of the embodiments of the present application:

In practical applications, there are some scenarios that need to improve the exposure of products. For example the following scenarios:

Scenario 1: For merchants, they will list products on some e-commerce platforms to achieve product exposure and sales. However, in practical applications, there are a large number of e-commerce platforms, and each e-commerce platform often contains a large number of commodities. As a result, the exposure of merchants' products is often low, and the probability of being known or purchased by users is low.

Scenario 2: For e-commerce platforms and sales websites, a single e-commerce platform may contain a large number of merchants and products listed by merchants. E-commerce platforms can push or display products to users through various recommendation or sorting algorithms. However, the number of these pushed or displayed products often accounts for a very low proportion of the total number of products on the shelves. For example, there may be hundreds of millions of products listed on a single mainstream e-commerce platform, but the number of products pushed or displayed to users may only be thousands. At this time, most of the commodities are difficult to be known or purchased by users. Therefore, the exposure of e-commerce platforms to commodities is low, which is not conducive to the development of e-commerce platforms.

For these scenarios where product exposure needs to be improved, an optional solution is to provide users with a product search service. That is, users can search for products on e-commerce platforms and sales websites according to their actual needs. For example, reference may be made to FIG. 1A, which is a schematic diagram of a product search interface. The user can input the product name in the interface shown in FIG. 1A and search for the product as required. However, the premise of commodity search is that the content provider (Content Provider, CP) provides commodity data to the commodity management system and the commodity data is stored in the database (also known as warehousing), so that the commodity management system can use the commodity data to provide the user with a product search service. For example, please refer to FIG. 1B, which is a schematic diagram of a product data uploading interface. The CP can organize the commodity data according to its own needs and the requirements of the commodity management system, and then upload the commodity data to the database of the commodity management system.

The embodiments of the present application achieve automatic verification and storage of commodity data. Compared with the prior art, in which the CP must organize structured commodity data by itself and upload it to the storage bucket of the commodity management system, the embodiments of the present application greatly reduce the technical threshold for the CP's commodity data operations and realize efficient management of commodity data. This avoids the low efficiency of commodity data management caused by complicated and cumbersome commodity data operations.

At the same time, some terms and concepts that may be involved in the embodiments of the present application are described as follows:

Commodity data: It is data composed of attribute data of one or more commodities, for example, commodity data can be composed of commodity names, prices, and links. The quantity of commodities specifically included in the commodity data may be determined by the CP according to the actual situation. In addition, in the embodiment of the present application, when setting the format of the commodity data, the technical personnel can also set the requirements for providing attribute data of the commodity. A request can contain mandatory and optional attribute data. On this basis, CP provides attribute data according to actual needs. Therefore, the type and quantity of attribute data actually included in the commodity data need to be determined according to the attribute data provision requirements set by the technical personnel in the actual application and the data provided by the CP, and there are no excessive restrictions here. In addition, the embodiments of the present application do not limit the content required for providing attribute data too much, which can be set by technical personnel according to actual needs. In this embodiment of the present application, when the CP terminal uploads these attribute data in the form of files, the commodity data may also be called commodity files. It should be noted that, in this embodiment of the present application, the commodity data may be structured data or unstructured data. Because in the embodiment of the present application, the commodity data will be divided into multiple sub-files for verification, and the attribute data that has passed the verification will be stored in the database. Therefore, regardless of whether the commodity data is structured or unstructured data, in the embodiments of the present application, efficient verification and storage of commodity data can be achieved, thereby improving commodity data management efficiency. 
Correspondingly, when the commodity data is structured data, the process of storing the attribute data in the sub-file into the database is still the process of structured storage of the commodity data.

As an example, it is assumed that the attribute data provision requirements include: product serial number (Identity document, ID), category, name, price, picture address, web page link, and application (Application, App) link. The product ID, category, name and price are all required attribute data, while picture address, web page link and App link are optional attribute data. The commodity data is also required to be in data table format. The picture address refers to the download address of the product picture. On this basis, the CP can prepare the attribute data of the commodity according to the above requirements, and organize the corresponding data table according to the commodity attribute data actually obtained. For example, the commodity data provided by the CP can be as shown in Table 1 below (the number of commodities is 4 in this case):

Table 1

Commodity information: Commodity data is the original data provided by CP. In order to realize the management of commodity data, the embodiment of the present application will store the commodity data in a database (that is, put in a warehouse). In order to distinguish the commodity data before and after storage, the embodiment of the present application refers to the commodity data stored in the database as commodity information. It should be noted that when the commodity data includes pictures and other data that cannot be stored in the database in a structured manner, this part of the data is regarded as data other than the commodity information and stored in other places than the database. For example, it can be stored in a network storage platform. The commodity data to be managed at this time consists of commodity information and unstructured data. It should be noted that, in the embodiments of the present application, the commodity information is all text-type information.

Database (DataBase, DB) and warehousing: The database is a data warehouse that organizes, stores and manages data according to the data structure. In this embodiment of the present application, the database is used to store commodity information.

It should be noted that structured data, also known as row data, is data that is logically expressed and implemented by a two-dimensional table structure (that is, structured data is data stored in the form of a two-dimensional table). It strictly follows data format and length specifications, and is mainly stored and managed through relational databases. In the embodiment of the present application, the commodity information is structured commodity data, that is, it belongs to structured data. Since the database uses a certain data structure for data storage, putting the commodity data into the warehouse means storing the commodity data in accordance with the structure requirements of the database. It can be seen from this that the warehousing process already includes the structuring of commodity data.

The specific terminal equipment where the database is located is not limited here; for example, it may be located on a single server or in a server cluster. In addition, the type of the database is not limited here, and can be selected or set by the technical personnel according to actual needs. For example, it can be MySQL, Oracle or SQL Server. Correspondingly, the structure of the two-dimensional table storing commodity information in the database can be determined according to the type of the specific database, which is not limited here.

Format (format of commodity data): In order to facilitate structured storage of commodity data (ie, storage in a database), the format of commodity data is set in advance by a technician in this embodiment of the present application. On this basis, the CP needs to organize the attribute data of the product according to this format, so as to obtain the product data that meets the requirements. For example, if the format is a data table, the CP needs to record the attribute data of the commodity into the data table, so as to obtain commodity data in the data table format. The embodiments of the present application do not impose too many requirements on the specific format, which can be set by technical personnel.

As an optional embodiment of the present application, technicians can set the format of commodity data as a data table, and set corresponding attribute data provision requirements (eg, which attribute data needs to be provided, and which can be provided or not provided). At this time, the CP needs to sort out the attribute data of the commodity according to the requirements for providing attribute data, and record the sorted attribute data into the data table.

In some optional embodiments, technicians can implement the settings for the format of commodity data and the provision of attribute data in the form of a data table template. That is, the technical personnel pre-set the required properties in the data table template. As an optional embodiment of the data table template, you can refer to the following table 2:

Table 2

Attribute 1	Attribute 2	Attribute 3	Attribute 4	Attribute 5	Attribute 6	Attribute 7

In this embodiment, the technician fills in the first row of the data table template in advance with each attribute (i.e., attribute 1 to attribute 7) that the CP needs to provide. The specific attributes selected are not limited here and can be set by the technician. For example, they can be the ID, category, name, price, picture address, web page link, and App link in Table 1, or other attributes. On this basis, the CP fills in or imports the attribute data of each commodity according to the attributes preset in the data table template to complete the template.

In addition, it should be noted that the format of commodity data is a data format preset by technicians, and the relationship between it and commodity data storage is described as follows:

In the embodiment of the present application, the format of the commodity data is set by the technician first, and then the CP organizes the attribute data of the commodity according to the format, and obtains the commodity data that satisfies the format. On this basis, the CP uploads the sorted commodity data to the commodity management system, and the commodity management system stores the commodity data. It can be seen that the commodity data uploaded by the CP (also the original commodity data processed by the commodity management system) is the data that meets the format requirements.

CP terminal: refers to the terminal device used by CP to upload product data. This embodiment of the present application does not limit too much the device type of the CP terminal, which can be determined according to actual application scenarios. For example, it can be a desktop computer, a laptop computer, a tablet computer, or a mobile phone. It should be understood that the terminal device that the CP performs commodity data preparation and commodity data uploading may not be the same device. For example, you can use a laptop to prepare product data, and then upload it using a mobile phone. Therefore, in theory, the CP terminal needs to have the ability to upload data, but it does not necessarily need to have the ability to add, delete, modify, verify, and adjust the format of commodity data.

User terminal: refers to the terminal device used by the user to search for goods. In practical applications, users can search for commodities by entering text or pictures in the user terminal and uploading them to the commodity management system. The user may be a consumer or other personnel, such as a CP, which needs to be determined according to the application scenario. This embodiment of the present application does not limit the device type of the user terminal too much, which can be determined according to the actual application scenario. For example, it can be a desktop computer, a laptop computer, a tablet computer, a mobile phone, or a wearable device.

Network Storage Platform (NSP): used for storage of commodity data uploaded by CP terminal, and storage of commodity pictures. In theory, devices with data storage capability and data transmission capability can be used as NSPs in the embodiments of the present application. In practical applications, the specific device type and quantity of NSPs can be selected or set by technicians according to actual needs. For example, it can be a single server with document and picture storage function, or a server cluster with document and picture storage function.

Feature library: In order to realize image search for commodities, the embodiment of the present application may perform image feature analysis on commodity pictures, and obtain image feature data. These image feature data are used for image matching during image search. In this embodiment of the present application, the feature library refers to a data warehouse for storing image feature data. In addition, according to actual needs, other data other than image feature data can also be stored in the feature library. The specific can be set by the technical personnel according to the needs. Wherein, the embodiment of the present application does not limit too much the situation of the terminal device where the feature library is located. It can be selected or set by technicians according to actual needs. For example, it can be in a single server, or in a server cluster. In this embodiment of the present application, the feature library may also be referred to as a commodity base library.

Cache component: Provides distributed lock management services. In this embodiment of the present application, in order to prevent a single task from being executed by multiple servers at the same time, a locking mechanism may be set. Before a server performs a task, it first applies to the cache component for the distributed lock for the task. When the distributed lock for the task is successfully obtained, the locking of the task is completed and the server can perform the task. Correspondingly, other servers can then no longer obtain the distributed lock of the task from the cache component, and cannot execute the task. In the embodiments of the present application, the implementation method of the distributed lock of the cache component is not limited, and can be set by technicians according to actual needs. For example, distributed locks can be implemented based on a distributed cache (DCS Redis) or based on ZooKeeper. Meanwhile, the embodiment of the present application does not limit the terminal device where the cache component is located, which can be selected or set by technicians according to actual needs. For example, it can exist as a component in the server. In this embodiment of the present application, when a distributed lock is implemented based on a distributed cache, the cache component may also be referred to as a DCS distributed lock.

To illustrate the technical solution described in the present application, the following takes as an example the case in which the format of the commodity data set by the technician is a data table. The two parts, commodity data management and commodity search by users, are described below through specific embodiments.

Part 1: Management of commodity data by the commodity management system.

In this embodiment of the present application, the commodity management system includes: at least one server, a database, and an NSP. In some optional embodiments, the commodity management system may further include a cache component and a feature library. Fig. 2A shows the system interaction diagram of the commodity management system during data management of commodity data, which is described in detail as follows:

S101, the CP terminal uploads commodity data to the NSP.

The CP first needs to prepare commodity data in the format set by the technician. The following takes a data table template as an example of the set format. Assume the data table template provided by the technician is Table 3 below:

Table 3

It is further set that ID, category, name, price and picture address are required attribute data, and that currency identifier and picture ID are optional attribute data. At least one of the web page link, App link and quick app link must be filled in.

Here, ID refers to the serial number of the commodity. Category is the category of the commodity, which can be divided according to different needs, for example into clothing, digital appliances, shoes, bags, home, toys, beauty, accessories, food and other categories. The currency identifier identifies the currency in which the price is expressed. For example, RMB can be ￥, USD can be $, and GBP can be ￡. The picture address refers to the download address of the commodity picture. In practical applications the CP may provide many commodity pictures, and uploading them one by one would be cumbersome. Therefore, the picture address attribute is provided in this embodiment of the present application: the CP can store the commodity pictures on some server and fill in the corresponding picture addresses in the data table template to provide the pictures. Picture ID refers to the serial number of the commodity picture. A web page link (weburl) refers to a link to a commodity sales web page; opening the link opens a browser and enters the corresponding sales web page. The web page link can be an ordinary web page link or an Html5 web page link. The App link refers to the link to the sales page of the commodity in an App; opening the link opens the corresponding App and jumps to the commodity sales page in the App. The quick app link refers to the link to the sales page of the commodity in a quick app; opening the link opens the corresponding quick app and jumps to the commodity sales page in the quick app. One or more links can be filled in for each of the web page link, App link and quick app link, so as to support jumping to different e-commerce platforms. For example, if three different web page links are provided, corresponding to the commodity sales web pages on three different e-commerce platforms, jumping to each of these platforms can be realized.

On the basis of Table 3, the CP can fill in or import commodity attribute data according to the actual situation to complete the preparation of the commodity data. (In practical applications, the CP generally organizes commodity data when carrying out commodity inventory management, so the CP's original commodity data can be imported into the data table template, and the workload of the CP is very small.) The actual number of commodities may be a few, or thousands or tens of thousands, as determined by the CP according to the actual situation. In theory, the above attribute data needs to be filled in for each commodity, and each row in the table represents the data of one commodity. Therefore, the number of rows in Table 3 also needs to be determined according to the actual number of commodities. For example, referring to Table 1, the number of commodities is 4.

As an optional embodiment of the present application, on the basis of Table 3, technical personnel can add or delete attributes according to actual requirements. For example, the quick app link can be deleted, attributes such as serial number, color and size can be added, or the price attribute can be subdivided into two attributes, such as a minimum price and a maximum price. This is not unduly limited here.

As an embodiment of the present application, in order to enrich the description of commodity details, a commodity description attribute may also be added on the basis of Table 3. The CP can then fill in a description of the commodity in Table 3 to help users understand the commodity in depth. For example, the CP can fill in "this product is green organic food".

After completing the preparation of the commodity data, the CP can upload the commodity data to the NSP through the CP terminal. For example, when the commodity data is a table file recording Table 3 (that is, Table 3 after the attribute data has been filled in or imported), the CP can upload the table file to the NSP through a device such as a mobile phone or a computer.

As an embodiment of the present application, in order to facilitate CP operation, a portal (Portal) website for uploading commodity data may be preset. In actual operation, the CP can access the portal website through the CP terminal and upload the commodity data from the portal website interface, thereby completing the upload of the commodity data.

As an optional embodiment of the present application, it is considered that in some cases the CP terminal may not be able to transmit data directly to the NSP. In this case, an intermediate device, such as a server, may be set between the CP terminal and the NSP. In actual operation, the intermediate device provides a callable application programming interface (Application Programming Interface, API) for the CP terminal. The CP terminal sends the commodity data to the intermediate device by calling the API, and the intermediate device uploads the commodity data to the NSP to complete the upload.

S102, the server downloads the commodity data from the NSP and splits it to obtain one or more sub-files. Wherein, each sub-file contains attribute data of at least one commodity.

In practical applications, the number of commodities recorded in the commodity data is often large. If the commodity data were checked and stored directly, efficiency would be low and errors would be likely, and the CP would have to wait a long time to learn the storage result. Therefore, to improve the management efficiency of the commodity data, the embodiment of the present application splits the commodity data into multiple sub-files. The format of the sub-files obtained by splitting may be the same as or different from that of the commodity data before splitting. For example, when the commodity data is in data table format, a sub-file can be a data table or a file in another format. Details are as follows:

After the commodity data is uploaded to the NSP, the server downloads the commodity data from the NSP and splits it into one or more sub-files. The splitting rules are not limited here and can be determined by technical personnel according to actual needs. In theory, the smaller the amount of data contained in a sub-file, the higher the processing speed, reliability and timeliness for that single sub-file; however, the number of sub-files then becomes large, which reduces the overall processing efficiency. Therefore, technicians can set the splitting rules according to the actual requirements for the efficiency and reliability of commodity data management. For example, it can be set that the number of commodities contained in each sub-file is m, where m is a positive integer, such as 1000. Each sub-file then contains attribute data of m commodities (the last sub-file may contain fewer than m commodities). As another example, it can be set that the number of commodities contained in each sub-file is any integer value in [1, n], where n is an integer greater than 1, such as 1000. The value can be selected randomly or according to certain rules.

Splitting the commodity data into sub-files has the following beneficial effects:

1. A sub-file contains less commodity attribute data. Compared with processing the complete commodity data, which contains more commodity attribute data, the server has a lower probability of error when processing a sub-file and is therefore more reliable.

2. Since a single piece of commodity data contains a lot of commodity attribute data, processing it directly on a single server would take a long time and be inefficient. After the commodity data is split into sub-files, the sub-files are handed over to one or more servers for parallel processing, which can greatly shorten the processing time and improve the processing efficiency of the commodity data. The CP can learn the storage result of the commodity data within a relatively short time, so the CP's experience in using the commodity management system is also improved.

It should be understood that when the number of commodities per sub-file specified in the splitting rule is greater than or equal to the total number of commodities contained in the commodity data, the result of the splitting is only one sub-file (in theory, no splitting action may be performed in this case). Therefore, the number of sub-files obtained in S102 is one or more. For example, suppose the splitting rule is that each sub-file contains 1000 commodities, but the commodity data contains fewer than 1000 commodities, such as 900. In this case, the attribute data of all commodities is placed in the same sub-file.
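The fixed-size splitting rule (m commodities per sub-file) can be sketched as follows. This is a minimal illustration that assumes the commodity data has already been parsed into a list with one entry per commodity; the function name split_commodities and the row layout are hypothetical.

```python
def split_commodities(rows, m=1000):
    """Split a list of commodity rows into sub-files of at most m rows each."""
    if m < 1:
        raise ValueError("m must be a positive integer")
    return [rows[i:i + m] for i in range(0, len(rows), m)]


# 2500 commodities with m = 1000: the last sub-file holds fewer than m rows.
rows = [{"ID": str(i)} for i in range(2500)]
sizes = [len(sub_file) for sub_file in split_commodities(rows, m=1000)]

# 900 commodities with m = 1000: everything lands in a single sub-file.
single = split_commodities(rows[:900], m=1000)
```

Here sizes is [1000, 1000, 500] and single contains exactly one sub-file of 900 rows, matching the two cases described above.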

As an optional embodiment of the present application, it is considered that in practical applications the number of commodities contained in a single piece of commodity data may be extremely large; for example, it may contain attribute data for thousands of commodities at the same time. To distinguish different commodities and uniquely determine a single commodity, a unique identifier can be added to each commodity in the sub-files when the data is split. The unique identifier can be added to the commodity data as a new item of attribute data of the commodity. In some optional embodiments, the CP may provide identifiers such as the commodity ID in the commodity data. For the commodity management system, however, the behavior of the CP is uncontrollable. Practice has shown that the identifiers provided by the CP may be duplicated, missing or irregular, so they may not be unique and their credibility is relatively low. In the embodiment of the present application, the server itself adds a unique identifier to each commodity, which ensures the reliability of the unique identifier and thereby the accurate distinction between commodities.

In addition, in some optional embodiments, the CP is required to provide commodity data in the form of a data table template. In this case, the attribute data of a single commodity is in the same row, that is, each row holds all the attribute data of one commodity. Therefore, the generated unique identifier can be added to the data table template as row number attribute data of the commodity, and the row number of the commodity in the commodity data is then the unique identifier of the commodity. The embodiments of the present application do not unduly limit the type and generation method of the unique identifier, which can be set by technical personnel. For example, the unique identifier can be formed from the upload time of the commodity data and the serial number of the commodity, or it can be a randomly generated non-repeating string. In addition, the length of the unique identifier can be set, for example to a fixed length of 16 characters. When the unique identifier is generated, if it is shorter than this length, it is padded with spaces or with 0s.
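One possible way to build a fixed-length identifier from the upload time and the row number is sketched below. The layout (a 10-character timestamp followed by a zero-padded row number) and the function name make_unique_id are assumptions chosen for illustration; the text leaves the exact generation method to technical personnel. The zeros are placed in front of the row number so that different row numbers cannot collapse into the same string.

```python
from datetime import datetime


def make_unique_id(upload_time, row_number, length=16):
    """Build a fixed-length identifier: timestamp + zero-padded row number."""
    stamp = upload_time.strftime("%y%m%d%H%M")  # 10 characters, e.g. 2010231200
    width = length - len(stamp)                  # characters left for the row number
    row = str(row_number).rjust(width, "0")      # pad with 0s up to the fixed length
    if len(row) > width:
        raise ValueError("row number does not fit in the identifier")
    return stamp + row


uploaded = datetime(2020, 10, 23, 12, 0)
first_id = make_unique_id(uploaded, 1)
tenth_id = make_unique_id(uploaded, 10)
```

With these inputs first_id is "2010231200000001". Under this assumed layout, identifiers are unique only as long as two uploads do not share the same minute, so a real system would need a finer timestamp or an additional discriminator.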

S103, the server stores all the sub-files in the NSP, and obtains the download address of each sub-file in the NSP. Based on the download address, subtasks corresponding to subfiles are created in the database, and a parent task containing these subtasks is created at the same time.

Among them, S103 can be subdivided into S1031 and S1032:

S1031, the server stores all the sub-files in the NSP, and obtains the download address of each sub-file in the NSP.

S1032: Based on the download address, the server creates sub-tasks corresponding to the sub-files one-to-one in the database, and creates a parent task including these sub-tasks at the same time.

After the commodity data is split and one or more sub-files are obtained, the server that performed the splitting uploads all the obtained sub-files to the NSP together. When the sub-files are stored, the download address of each sub-file in the NSP is obtained. The NSP and the server that executes S103 are mutually independent devices.

After obtaining the download address of each subfile, the server will create a subtask corresponding to each subfile one-to-one in the database, and store the download address of each subfile in the corresponding subtask. In this embodiment of the present application, the subtask can be executed by the server. The essence of executing the subtask is: the server downloads the subfile corresponding to the subtask through the download address in the subtask, and checks and stores the attribute data in the downloaded subfile. By executing each sub-task, the server can effectively process each sub-file, and finally realize the storage of commodity data.

In practical applications, every subtask needs to be executed for the commodity data to be stored completely. Therefore, while executing the subtasks, the server needs to confirm whether all the subtasks under a single piece of commodity data have been completed (the confirmation method can be set by technicians and is not limited here). The commodity management system may also need to process multiple pieces of commodity data at the same time, so the database may store subtasks corresponding to multiple pieces of commodity data simultaneously. The number of subtasks is then large and they are harder to manage. To facilitate the management of the subtasks under each piece of commodity data, the server also creates a parent task containing these subtasks when creating them. Each piece of commodity data therefore corresponds to one parent task, and a single parent task contains all the subtasks under the corresponding commodity data. To confirm whether all subtasks under a certain piece of commodity data have been executed, it is sufficient to query the execution status of each subtask in the corresponding parent task, which improves the efficiency of subtask management.

In addition, the embodiment of the present application further records the execution status of each subtask. In the embodiment of the present application, the execution status of a subtask is one of three types: not executed, executing, and execution completed. When a subtask is executed by a server, the database updates the execution status of the subtask synchronously.

Here, not executed means that the subtask is not currently being executed by any server. Executing means that the subtask is currently being executed by at least one server and no server has completed it. Execution completed means that at least one server has finished executing the subtask. Because, in the embodiment of the present application, the essence of executing a subtask is to check and store the attribute data in the sub-file corresponding to the subtask, not executed means that the attribute data in the corresponding sub-file has not been checked and stored, executing means that the attribute data in the corresponding sub-file is being checked and stored, and execution completed means that the attribute data in the corresponding sub-file has been checked and stored. For a subtask that has just been created, the execution status is marked as not executed in the database.

The following example illustrates this. Suppose that sub-file a and sub-file b are obtained after commodity data A is split. The server stores the two sub-files in the NSP and obtains their download addresses. Suppose the download address of sub-file a in the NSP is https://xxxhuawei.com/filea/huawei.html, and the download address of sub-file b in the NSP is https://xxxhuawei.com/fileb/huawei.html. The server then creates subtask a and subtask b in the database, as well as a parent task A that contains both subtasks (the parent task may have no substantive task content and may only record the subtasks it contains). At the same time, the download address https://xxxhuawei.com/filea/huawei.html is stored in subtask a, and the download address https://xxxhuawei.com/fileb/huawei.html is stored in subtask b.
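The relationship between a parent task and its subtasks in this example can be sketched with two small record types. The class names, field names and ID scheme below are hypothetical and stand in for whatever schema the database actually uses; only the two download addresses and the three execution states come from the text.

```python
from dataclasses import dataclass, field

NOT_EXECUTED = "not executed"
EXECUTING = "executing"
COMPLETED = "execution completed"


@dataclass
class SubTask:
    task_id: str
    download_url: str            # address of the sub-file in the NSP
    status: str = NOT_EXECUTED   # newly created subtasks start as not executed


@dataclass
class ParentTask:
    task_id: str
    subtasks: list = field(default_factory=list)

    def all_completed(self):
        """True when every subtask under this parent has been completed."""
        return all(s.status == COMPLETED for s in self.subtasks)


def create_tasks(parent_id, download_urls):
    """Create one subtask per sub-file plus the parent task that groups them."""
    parent = ParentTask(parent_id)
    for index, url in enumerate(download_urls):
        parent.subtasks.append(SubTask(f"{parent_id}-{index}", url))
    return parent


parent = create_tasks("A", [
    "https://xxxhuawei.com/filea/huawei.html",
    "https://xxxhuawei.com/fileb/huawei.html",
])
```

Checking whether commodity data A has been fully stored then reduces to calling parent.all_completed(), which corresponds to querying the execution status of each subtask in the parent task.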

As an optional embodiment of the present application, in order to facilitate the distinction of each subtask, when creating a subtask, an identifier or ID may be added to each subtask. When the database and the server interact, they can uniquely determine the subtask by informing each other of the subtask identifier or ID.

As an optional embodiment of the present application, after the server stores the sub-file in the NSP, the server may delete the commodity data uploaded by the CP terminal in the NSP to save NSP storage space.

As an optional embodiment of the present application, it is considered that in practical applications multiple CPs may upload their respective commodity data. In this case, multiple servers can be used at the same time to improve the processing of the commodity data, and distributed locks are introduced to avoid wasting server resources.

In scenarios where multiple servers are used for commodity data splitting and task creation, the embodiments of the present application introduce a distributed lock mechanism to prevent a single piece of commodity data from being processed repeatedly. That is, in S102, the server first applies for a distributed lock for a single piece of commodity data, and performs operations such as downloading and splitting the commodity data only after the distributed lock is obtained (that is, the locking is complete). In this case, S102 can be replaced with: S1021, the server applies to the cache component for the distributed lock for the commodity data. If the distributed lock is obtained, the commodity data is downloaded from the NSP and split to obtain one or more sub-files, where each sub-file contains attribute data of at least one commodity.

As an optional embodiment of the present application, refer to FIG. 2B. In the embodiment of the present application, multiple servers are responsible for splitting commodity data, so as to realize task decomposition. At the same time, distributed locks are introduced to prevent a single piece of commodity data from being repeatedly processed by multiple servers. Details are as follows:

Each server queries a task list that records commodity data to be processed.

If there is commodity data that needs to be processed, each server applies for the distributed lock for that commodity data, that is, the servers compete to grab the lock.

The server that successfully grabs the lock will act as the execution body of S102-S103 to download commodity data from NSP.

After the commodity data is downloaded, it is split to obtain multiple sub-files.

All the obtained sub-files are stored in the NSP, and at the same time, based on the download address of the sub-files, sub-tasks corresponding to each sub-file are created in the database.

In addition, when the sub-file is split, a line number can also be added to each commodity in the sub-file as a unique identifier.

S104, the server sends a task query request to the database.

After the subtasks are created in S102-S103, the embodiment of the present application starts to process the subtasks so as to verify the sub-file corresponding to each subtask. To process the subtasks, the server first sends a task query request to the database, which requests the database to inform the server of the subtasks to be executed under the current parent task. The data content and format of the task query request are not limited here and can be set by technical personnel according to requirements.

S105, after receiving the task query request, the database filters out subtasks to be executed from all subtasks included in the parent task, and returns the screened subtasks to the server in the form of a subtask list.

In this embodiment of the present application, whether the subtask is a subtask to be executed is determined based on the execution state of the subtask. Specifically, the execution state corresponding to the subtask to be executed is preset by the technician. On this basis, after receiving the task query request, the database will identify the execution status of each subtask under the parent task, and screen out the subtasks to be executed whose execution status meets the requirements. For example, all subtasks whose execution status is not executed may be regarded as subtasks to be executed. At this time, the database will regard all subtasks whose execution status is not executed under the parent task as subtasks to be executed.

In addition, according to actual needs, technicians can add other screening conditions on top of the execution status, so as to distinguish and screen the subtasks to be executed accurately. For example, a limit on the execution duration of the task can be added in addition to the execution status. In this case, the database obtains both the execution status and the execution duration of the subtasks, and screens out the subtasks whose execution status and execution duration both meet the preset requirements as the subtasks to be executed.

As an optional embodiment of the present application, depending on the execution status and the execution duration, the subtasks to be executed can cover the following optional ranges:

1. All unexecuted subtasks.

2. All unexecuted subtasks and all executing subtasks.

3. All unexecuted subtasks and some of the executing subtasks.

In practical applications, technicians can select any one of the above three ranges as the range of subtasks to be executed according to requirements, or can set another range of subtasks to be executed as needed. This is not unduly limited here.

As an optional embodiment of the present application, two considerations apply in practice. On the one hand, unexecuted subtasks need to be processed by a server. On the other hand, a server may be unable to process a subtask normally, for example because of downtime. In that case, although the subtask is marked as being executed, it cannot be completed: however long one waits for the server, the processing of the subtask will not finish and the sub-file will not be verified. Such subtasks therefore need to be reprocessed by other servers. Based on these two considerations, in this embodiment of the present application, unexecuted subtasks and subtasks that are being executed but whose execution duration has timed out are regarded as subtasks to be executed. The filtering operation for the subtasks to be executed in S105 can then be replaced with:

After receiving the task query request, the database filters out, from all subtasks included in the parent task, the unexecuted subtasks and the subtasks that are being executed and whose execution duration exceeds the duration threshold.

To determine whether a subtask has timed out, a duration threshold is preset in this embodiment of the present application. A subtask whose execution duration exceeds the duration threshold is determined to have timed out.
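The screening rule with a duration threshold can be sketched as a simple filter. The dictionary layout, the function name select_pending and the use of plain numbers of seconds for timestamps are assumptions made for illustration; in the described system this logic would run in the database or on the device hosting it.

```python
def select_pending(subtasks, now, duration_threshold):
    """Return unexecuted subtasks plus executing subtasks that have timed out."""
    pending = []
    for task in subtasks:
        if task["status"] == "not executed":
            pending.append(task)
        elif (task["status"] == "executing"
              and now - task["last_update_time"] > duration_threshold):
            # The server handling this subtask is presumed to have failed.
            pending.append(task)
    return pending


subtasks = [
    {"id": "a", "status": "not executed", "last_update_time": None},
    {"id": "b", "status": "executing", "last_update_time": 100.0},
    {"id": "c", "status": "executing", "last_update_time": 950.0},
    {"id": "d", "status": "execution completed", "last_update_time": 500.0},
]
pending_ids = [t["id"] for t in
               select_pending(subtasks, now=1000.0, duration_threshold=600.0)]
```

With a threshold of 600 seconds, subtask b (running for 900 seconds) is treated as timed out and is returned together with the unexecuted subtask a, while c and d are left alone.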

After the subtasks to be executed are screened out in S105, the embodiment of the present application puts these subtasks in the same list (that is, the subtask list) and returns the subtask list to the server that sent the task query request. In the subtask list, each subtask is recorded with the download address of its corresponding sub-file, so that the server can download the sub-file for processing. If only one subtask to be executed is screened out, the database may skip building a subtask list and return the screened subtask directly to the server, or it may return a subtask list containing only that subtask. The details can be set by technicians.

As an optional embodiment of the present application, to make it easier for the server to determine the subtask to be executed from the subtask list, the database records the creation time of each subtask. After the subtasks are screened out, they are prioritized according to their creation time and execution status, and a subtask list is generated according to the sorting result. The sorted subtask list is then returned to the server.

The specific priority sorting rules for subtasks are not limited here and can be set by technical personnel according to actual needs. For example, it can be set that unexecuted subtasks have higher priority than executing subtasks. Among the unexecuted subtasks, priority decreases from the earliest creation time to the latest, and the same applies among the executing subtasks.
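The example ordering rule (unexecuted before executing, earlier creation time first within each group) can be sketched with a two-part sort key. The field names and the STATUS_RANK table are hypothetical.

```python
# Lower rank means higher priority.
STATUS_RANK = {"not executed": 0, "executing": 1}


def sort_by_priority(subtasks):
    """Order subtasks by execution status first, then by creation time."""
    return sorted(subtasks,
                  key=lambda t: (STATUS_RANK[t["status"]], t["created_at"]))


subtasks = [
    {"id": "a", "status": "executing", "created_at": 10},
    {"id": "b", "status": "not executed", "created_at": 30},
    {"id": "c", "status": "not executed", "created_at": 20},
    {"id": "d", "status": "executing", "created_at": 5},
]
ordered_ids = [t["id"] for t in sort_by_priority(subtasks)]
```

The resulting order is c, b, d, a: both unexecuted subtasks come first, and within each group the subtask created earlier is ranked higher.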

In addition, it should be noted that, as an optional embodiment of the present application, the logic of each operation to be performed by the database in S105 can be built into the database itself, or can be built into the terminal device where the database is located in the form of a program. The specifics can be set by technical personnel according to actual needs. When the logic is built into the database, the operation of S105 is completed by the database itself. When the program is built into the terminal device where the database is located, the terminal device completes the operation of S105.

S106: After receiving the subtask list, the server determines a subtask from the subtask list, and downloads a subfile corresponding to the subtask from the NSP according to the download address in the determined subtask.

After receiving the subtask list, the server will select a subtask as the subtask to be executed this time. At the same time, after the subtask selection is completed, the server will also download the corresponding subfile from the NSP according to the download address in the subtask to perform subsequent data verification. Wherein, the embodiment of the present application does not limit the selection method of subtasks too much, which can be set by technical personnel according to the actual situation. For example, in some optional embodiments, the first subtask in the subtask list may be selected, or a subtask may be randomly selected. In some embodiments, prior to sending the subtask list, the database has prioritized each subtask in the subtask list. At this time, the server can select the subtask with the highest priority for processing. For example, when the priority is sorted in descending order, the first subtask can be selected for processing.

As an embodiment of the present application, after determining the subtask to be executed this time, the server also informs the database that the subtask is now being executed (this can be achieved by sending the database an instruction carrying the subtask ID and execution status), which helps the database update the execution status of the subtask and provides data for calculating the execution duration. Correspondingly, when the database learns that the subtask has been selected for execution by the server, it sets the execution status of the subtask to executing and sets the time at which it learned of this message as the last update time (last_update_time) of the subtask. The execution duration of the subtask is equal to the difference between the current time and the last update time, that is, now()-last_update_time. For a subtask that is already in the executing state, it is only necessary to update the time at which the subtask starts to be executed; the execution status does not need to be changed.
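The bookkeeping around last_update_time can be sketched as two small helpers. The function names and the dictionary layout are hypothetical, and timestamps are passed in explicitly as plain numbers of seconds so that the arithmetic is easy to follow; only the field name last_update_time and the formula now()-last_update_time come from the text.

```python
def mark_executing(subtask, now):
    """Record that a server has just selected this subtask for execution."""
    subtask["status"] = "executing"
    subtask["last_update_time"] = now


def execution_duration(subtask, now):
    """Execution duration = now - last_update_time."""
    return now - subtask["last_update_time"]


subtask = {"id": "sub-a", "status": "not executed", "last_update_time": None}

mark_executing(subtask, now=1000.0)
elapsed = execution_duration(subtask, now=1090.0)

# Another server later takes over the same subtask: the status stays
# "executing" and only the start time is refreshed.
mark_executing(subtask, now=1200.0)
elapsed_after_takeover = execution_duration(subtask, now=1230.0)
```

In this sketch elapsed is 90.0 seconds, and after the takeover the duration is measured from the new start time, giving 30.0 seconds.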

As an optional embodiment of the present application, in order to improve the processing efficiency of subtasks, multiple servers may be used to process the subtasks in the parent task at the same time. However, in practical applications it is found that multiple servers may select the same subtask for processing at the same time, which reduces the processing efficiency of the commodity data. To prevent a single subtask from being processed repeatedly by multiple servers at the same time, distributed locks are introduced in this embodiment of the present application. Specifically, referring to FIG. 2C, S106 can be replaced with:

S1061, after receiving the subtask list, the server selects a subtask from the subtask list without repetition, and after selecting the subtask, applies to the cache component for a distributed lock on the subtask.

S1062, if the distributed lock of the subtask is successfully acquired, the server stops the selection operation for the subtask, and downloads the subfile of the subtask from the NSP according to the download address in the subtask.

S1063, if the distributed lock of the subtask is not successfully acquired, the server returns to the operation of selecting, without repetition, a subtask from the subtask list and, after the subtask is selected, applying to the cache component for the distributed lock on that subtask.

In the embodiment of the present application, after receiving the subtask list, the server first selects a subtask from the subtask list and applies to the cache component for the distributed lock on that subtask. The server may inform the cache component which subtask's distributed lock it is applying for by sending the identifier or ID of the subtask to the cache component.

The distributed lock of a single subtask can only be assigned to a single server. Therefore, if the subtask is not being processed by any other server, the distributed lock for the subtask can in theory be obtained. Conversely, if the subtask is already being processed by another server then, because a distributed lock must be applied for before execution, the cache component will have recorded that the distributed lock for the subtask has been granted to another server, and the distributed lock for the subtask cannot be acquired. Based on this principle, after obtaining the distributed lock and completing the locking operation, the server determines that the subtask is the one to be executed this time and downloads the corresponding sub-file. If acquiring the distributed lock fails, the subtask selection operation in S1061 is executed again to select another suitable subtask.

The embodiment of the present application does not unduly limit the selection method of subtasks; in theory, it is only necessary that the selection is not repeated. For example, in some optional embodiments, subtasks may be selected randomly or sequentially. As an optional embodiment of the present application, if the database has prioritized the subtasks in the subtask list before sending the subtask list, the server can select subtasks in order of priority from high to low. For example, when the list is sorted by priority from high to low, the selection method may be set as: from the subtasks in the subtask list that have not yet been selected, select the one that is ranked first. In this way, the embodiment of the present application can prevent a single subtask from remaining unprocessed for a long time and can improve the efficiency of processing subtasks.

In addition, in order to update the execution status of subtasks in a timely manner, in this embodiment of the present application, after acquiring the distributed lock in S1062, the server further informs the database that the current subtask is being executed, so that the database can update the execution status and execution time of the subtask.

S107, the server performs attribute data verification on each commodity in the sub-file, and stores the attribute data of the commodities that have passed the verification in the database.

If there are commodities that have not passed the verification, the abnormal information of the attribute data of these commodities is recorded, and the abnormal information is stored in the database.

After acquiring the sub-file, the server starts to verify the attribute data of each commodity in the sub-file, that is, to check whether the attribute data of the commodity meets the preset requirements. The requirements for commodity attribute data may differ between practical application scenarios. For example, in some possible scenarios, in order to suit the display effect of mainstream terminal devices, the requirements for commodity pictures are relatively strict, and the format and size of the pictures may be more strictly constrained. In other possible scenarios, in order to provide users with more comprehensive commodity data, there may be higher requirements on the types and quantity of commodity attribute data. Therefore, the embodiments of the present application do not unduly limit the specific data verification requirements. In practical applications, the requirements for commodity attribute data can be preset by technical personnel according to the actual application scenario. The server then verifies the attribute data of each commodity in the sub-file according to the preset requirements.

In addition, when the sub-file contains multiple commodities, the attribute data of only one commodity may be verified at a time until every commodity has been verified. Alternatively, multiple commodities may be processed concurrently, that is, attribute data verification is performed on multiple commodities at the same time, so as to improve the verification efficiency. The specific choice can be made by technical personnel according to their own needs and is not unduly limited here. The verification rules for the individual attribute data of a single commodity are likewise not limited here and can be set by technical personnel. For example, in some optional embodiments, it may be set to determine sequentially whether each attribute data of a commodity meets the requirements. In other optional embodiments, it can also be set to determine whether multiple attribute data meet the requirements at the same time, which can improve the verification efficiency.

On the basis of setting requirements for commodity attribute data and verification rules, the embodiment of the present application will take a single commodity as an operation object to verify attribute data. Therefore, in theory, each product in the sub-file will have a corresponding verification result. In the embodiment of the present application, the verification results of a single commodity can be divided into two categories:

1. All attribute data of the commodity have passed the verification (referred to as verification passed).

2. There is an abnormality in the attribute data of the product, so that the verification fails (referred to as verification failure).

For commodities that have passed the verification, in this embodiment of the present application, the server stores the attribute data of the commodities in the database. Since the database stores data in a structured manner, storing the attribute data in the database is a structured storage process for the attribute data. If the attribute data includes a commodity picture, or includes the download address of a commodity picture, then in this embodiment of the present application the corresponding commodity picture is stored in the NSP.

For commodities that fail the verification, the server records the abnormal information corresponding to the commodity, such as which attribute data of the commodity is abnormal, and feeds the abnormal information back to the database. Recording the abnormal commodity attribute data in this way makes it convenient for the CP to re-upload the commodity attribute data according to the abnormal information and improves the efficiency of commodity data management.

An example of performing sub-file data verification is given below. Assume that the format of the commodity data is a data table, that each row in the data table records all attribute data of one commodity, and that the attribute data that must be provided include: commodity ID, name, category, picture address and web page link. Assume also that when the server splits the commodity data in S102, each resulting sub-file is likewise in data table format, and that a unique ID is generated for each commodity in the sub-file and added to the commodity's row as the line number of the commodity. On this basis, referring to FIG. 2D, the data verification operation on the sub-file may include S1071-S10710.

S1071, the server reads one line of data in the sub-file, and splits the read one line of data into a first line number and first data.

In this embodiment of the present application, only the attribute data of a single commodity is verified each time. So the server will read a single row of data at a time. As an optional embodiment of the present application, in order to prevent the attribute data of the same commodity from being verified multiple times, it may be set as a non-repetitive reading operation each time. At this point, the operation of the server to read a line of data in the subfile can be replaced by:

The server reads a line of data within the subfile without repetition.

After reading the attribute data of a single commodity, the embodiment of the present application first extracts the line number therein (i.e., the first line number), to obtain the line number of the commodity and the remaining attribute data.

The attribute data required in the embodiment of this application include the ID, name, category, picture address and web page link of the commodity. Therefore, if the CP provides commodity data according to this requirement, the remaining attribute data at this time are in theory the ID, name, category, picture address and web page link of the commodity.

S1072, the server determines whether the first line number has been processed. If processed, return to S1071 to continue processing the next line of data. If it has not been processed, S1073 is executed.

In practical applications, unexpected situations may cause a single row of data to be processed multiple times by a single server. For example, a single row of data may be repeatedly divided into multiple sub-files, and the sub-files containing the same row of data may be processed by the same server. Another example is that the server repeatedly reads the same row of data in S1071 (when non-repetitive reading is not set). In these cases, the attribute data of a single commodity would be verified repeatedly, which reduces the efficiency of commodity data verification. Because the line number is the unique identifier of the commodity, these situations can be handled as follows: after parsing out the line number of the commodity, the server first determines whether it has already processed that line number.

If it has been processed, the attribute data of the commodity has been verified before, and no further verification is required. Therefore, a new row of data is selected, and the line number of the newly selected row is checked.

If it has not been processed, it means that the verification of commodity attribute data is required. So the next steps will continue at this point.
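The line number handling in S1071 and S1072 can be sketched as follows. The row layout is an assumption made only for illustration: each row is taken to be comma-separated text whose first field is the line number. The function names `split_row` and `unprocessed_rows` are hypothetical.

```python
def split_row(row):
    """S1071: split one row into its line number and the remaining data."""
    line_number, _, first_data = row.partition(",")
    return line_number, first_data


def unprocessed_rows(rows):
    """S1072: yield each (line number, data) pair once, skipping repeated line numbers."""
    seen = set()                      # line numbers this server has already processed
    for row in rows:
        line_number, first_data = split_row(row)
        if line_number in seen:
            continue                  # already processed, return to S1071
        seen.add(line_number)
        yield line_number, first_data


rows = ["r1,1001,Shirt", "r2,1002,Lamp", "r1,1001,Shirt"]
result = list(unprocessed_rows(rows))
```

Here the third row repeats the line number `r1`, so only the first two rows reach the later verification steps.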

S1073, the server parses the attribute data in the first data. If the parsing fails, it is determined that the current row data is abnormal, the corresponding abnormal information is uploaded to the database, and the execution returns to S1071. If the parsing is successful, all attribute data contained in the first data are obtained, and S1074 is executed.

After the line number check is passed, the embodiment of the present application starts to parse the attributes in the first data to determine whether the text format of the first data is legal. If the first data can be parsed normally to obtain the ID, name, category, picture address and web page link of the commodity, the text format of the first data is legal. On the other hand, if the parsing fails, the text format of the first data is illegal and the data cannot be parsed and restored normally. In that case, the embodiment of the present application uploads the abnormal information corresponding to the data parsing abnormality to the database. The database records the abnormal information so that the abnormal situation can be fed back to the CP, helping the CP to quickly locate the abnormal commodity and provide its attribute data again.

The embodiment of the present application does not unduly limit the data type of the abnormal information, which can be set by technical personnel according to actual needs. For example, the specific abnormal condition, such as "data parsing abnormality", can be described in text form, and the text can be used as the abnormal information. For another example, exception codes may be set in advance for the various possible abnormal situations, and the corresponding exception code may be used as the abnormal information. For example, the exception code corresponding to a parsing failure can be set to 2203, in which case the abnormal information is 2203. The abnormal information can also combine an exception code and text. The same applies to the data types of the abnormal information in the following steps, which will not be repeated in this embodiment of the present application.

S1074, verify the validity of each attribute data obtained by parsing. If there is attribute data for which the validity check fails, it is determined that the current row data is abnormal, the corresponding abnormal information is uploaded to the database, and the process returns to execute S1071. If the validity check of each attribute data is passed, then execute S1075.

After the parsing is successful, the embodiments of the present application perform validity verification on the parsed ID, name, category, picture address, and web page link respectively, that is, determine whether each attribute data has problems such as missing data or data errors. For example, for the category, assume that commodities are pre-divided into categories such as clothing, digital appliances, shoes, bags, home, toys, beauty, accessories, and food. The embodiment of the present application then verifies whether the filled-in category data belongs to these categories. If it does, the validity check of the category is passed. If not, the verification fails. For example, if "pants" is filled in, it does not belong to the above categories, and the verification fails. If "digital" is filled in, the entry is incomplete, that is, data is missing, and the verification also fails. When the verification fails, the embodiment of the present application uploads the abnormal information corresponding to the validity verification failure to the database, and the database records the abnormal information.

Considering that there are many kinds of attribute data, the embodiments of the present application do not unduly limit the rules for the validity verification of these attribute data, which can be set by technical personnel. For example, it can be set to verify each attribute data in sequence, or to verify multiple attribute data at the same time. It can also be set to stop the verification and determine that the current row data is abnormal as soon as the validity verification of any attribute data fails.

As an optional embodiment of the present application, the abnormal information corresponding to a validity verification failure may be the text "commodity parameter verification failed", the exception code 2204, or both.
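A minimal sketch of the validity verification in S1074 follows, using the category example above and the variant that stops at the first failing attribute. The field names in `REQUIRED_FIELDS`, the exact contents of `CATEGORIES`, and the function `first_invalid_field` are assumptions for illustration; real rules would be preset by technical personnel.

```python
# Hypothetical field names for the five required attributes.
REQUIRED_FIELDS = ("id", "name", "category", "picture_address", "web_link")

# Example category list taken from the text; the actual list is configurable.
CATEGORIES = {
    "clothing", "digital appliances", "shoes", "bags", "home",
    "toys", "beauty", "accessories", "food",
}


def first_invalid_field(attrs):
    """Return the name of the first attribute that fails the check, or None."""
    for field in REQUIRED_FIELDS:
        if not attrs.get(field, "").strip():
            return field                  # missing or empty attribute data
    if attrs["category"] not in CATEGORIES:
        return "category"                 # e.g. "pants" or the incomplete "digital"
    return None


valid = {
    "id": "1001", "name": "Shirt", "category": "clothing",
    "picture_address": "http://example.com/a.jpg",
    "web_link": "http://example.com/a",
}
pants = dict(valid, category="pants")
incomplete = dict(valid, category="digital")
no_name = dict(valid, name="")
```

A caller would map a non-None result to the abnormal information for this step, for example exception code 2204.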

S1075, according to the ID in the attribute data, determine whether the commodity corresponding to the row data already exists. If the commodity already exists, it is determined that the current row data is abnormal, the corresponding abnormal information is uploaded to the database, and the execution returns to S1071. If the commodity does not exist, execute S1076.

In practical applications, when there are many commodities, the CP may record the attribute data of the same commodity in the commodity data multiple times while organizing the commodity data. In this case, the server might verify the same commodity repeatedly, which reduces the efficiency of verification. Therefore, after the validity verification of the attribute data is completed in the embodiment of the present application, the server determines, according to the ID of the commodity, whether it has already processed the commodity with that ID.

If it has been processed, the attribute data of the commodity has been uploaded repeatedly and has been verified before, so no further verification is required. Therefore, a new row of data is selected, and the line number of the newly selected row is checked. The abnormal information corresponding to the abnormal row data is also uploaded to the database, and the database records the abnormal information.

If it has not been processed, it means that the attribute data verification of the product needs to be continued. So the next steps will continue at this point.

As an optional embodiment of the present application, the abnormal information corresponding to a commodity that already exists may be the text "the commodity already exists", the exception code 2202, or both.

S1076, download the product image according to the image address. If the download fails, it is determined that the current row data is abnormal, the corresponding abnormal information is uploaded to the database, and the execution returns to S1071. If the download is successful, execute S1077.

After confirming that the product has not been processed, the embodiment of the present application will try to download the product image according to the image address.

If the download fails, there is a problem with the image address. For example, the image address may be wrong, or the image source may have been deleted. At this time, the embodiment of the present application uploads the abnormal information corresponding to the image download failure to the database, and the database records the abnormal information.

If the download is successful, the download address is correct, and the subsequent verification will continue at this time.

As an optional embodiment of the present application, the abnormal information corresponding to the download failure may be the text "image download failed", the exception code 2303, or both.

S1077: Determine whether the volume of the downloaded product image exceeds a preset volume threshold. If the volume threshold is exceeded, it is determined that the current row data is abnormal, the corresponding abnormal information is uploaded to the database, and the process returns to S1071. If the volume threshold is not exceeded, execute S1078.

Commodity pictures that are too large occupy too much NSP storage space and are inconvenient for users to download and view. To prevent this, in this embodiment of the present application a volume threshold is preset, and the volume of a commodity picture provided by the CP is required not to exceed the volume threshold. The specific value of the volume threshold can be set by technical personnel according to actual needs. For example, it can be set to 2 MB. Therefore, after downloading the commodity picture, this embodiment of the present application determines whether the volume of the commodity picture exceeds the volume threshold.

If it exceeds the threshold, the picture size does not meet the requirements. At this time, the server determines that the picture volume is abnormal and uploads the abnormal information corresponding to the picture volume to the database, and the database records the abnormal information.

If it does not exceed, it means that the size of the picture meets the requirements. At this point, subsequent verification will continue.

As an optional embodiment of the present application, the abnormal information corresponding to the image volume exceeding the volume threshold may be the text "image is too large", the exception code 2305, or both.

S1078: Identify whether the format of the commodity picture belongs to the first format. If it does not belong to the first format, it is determined that the current row data is abnormal, the abnormal information is recorded, and the execution returns to S1071. If it belongs to the first format, execute S1079.

In practical applications, pictures come in a wide variety of formats, but the picture formats supported by a single server and terminal device are often limited. An unsupported picture format would prevent the server from processing the commodity picture, or prevent the user from viewing it normally. To avoid this, the embodiment of the present application continues by verifying whether the format of the commodity picture is legal. In this embodiment of the present application, one or more formats are preset as legal formats (i.e., the first format), and the server identifies whether the format of the commodity picture is a legal format. If it is, the verification is passed. If it is not, the verification fails and the format of the commodity picture is determined to be abnormal. The abnormal information corresponding to the picture format is then uploaded to the database, and the database records the abnormal information.

For example, assume that the legal formats are set to include jpg, png, bmp, and gif. If the format of the commodity picture is any one of these, the format is considered legal. Otherwise, the format of the commodity picture is considered abnormal and the verification fails.

As an optional embodiment of the present application, the abnormal information corresponding to a picture format that is not a legal format may be the text "unsupported picture format", the exception code 2304, or both.
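The volume check of S1077 and the format check of S1078 can be sketched together as follows. The sketch assumes that the 2 MB example threshold means 2 × 1024 × 1024 bytes and that the format is identified from the leading signature bytes of the file; the embodiment does not specify how the format is identified, so this is one possible approach. The names `MAX_IMAGE_BYTES`, `SIGNATURES`, `detect_format` and `check_image` are hypothetical.

```python
MAX_IMAGE_BYTES = 2 * 1024 * 1024   # assumed reading of the 2 MB example threshold

# Leading signature bytes of the four example legal formats.
SIGNATURES = {
    "jpg": (b"\xff\xd8\xff",),
    "png": (b"\x89PNG\r\n\x1a\n",),
    "bmp": (b"BM",),
    "gif": (b"GIF87a", b"GIF89a"),
}


def detect_format(data):
    """Return the name of the matching legal format, or None if none matches."""
    for name, prefixes in SIGNATURES.items():
        if data.startswith(prefixes):
            return name
    return None


def check_image(data):
    """Return an exception code from the text, or None if both checks pass."""
    if len(data) > MAX_IMAGE_BYTES:
        return 2305                  # "image is too large"
    if detect_format(data) is None:
        return 2304                  # "unsupported picture format"
    return None


small_png = b"\x89PNG\r\n\x1a\n" + b"\x00" * 10
webp = b"RIFF\x00\x00\x00\x00WEBP"
huge_jpg = b"\xff\xd8\xff" + b"\x00" * MAX_IMAGE_BYTES
```

The volume is checked first, matching the order of S1077 and S1078, so an oversized picture in an unsupported format is reported as too large.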

S1079, upload the product image to the NSP. If the upload fails, it is determined that the image upload is abnormal, the abnormal information is recorded, and the process returns to S1071. If the upload is successful, execute S10710.

After completing the verification of the volume and format of the commodity picture, the embodiment of the present application uploads the commodity picture to the NSP for storage.

When the upload fails, there is a problem with the data transmission between the server and the NSP. For example, the NSP may have failed, or the network between the two may be unstable. At this time, the embodiment of the present application uploads the abnormal information corresponding to the picture upload failure to the database, and the database records the abnormal information.

If the upload is successful, continue with subsequent operations.

As an embodiment of the present application, in order to improve the probability of a successful upload, the upload may be attempted multiple times if it fails. Only if all of the attempts fail is the image upload determined to be abnormal.

As an optional embodiment of the present application, the abnormal information corresponding to the image upload failure may be the text "system abnormality", the exception code 1001, or both.
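The retry behaviour described for S1079 can be sketched as below. The number of attempts (three), the use of `OSError` to signal a transmission failure, and the names `upload_with_retry` and `flaky_upload` are assumptions for illustration; the embodiment only states that the upload may be attempted multiple times.

```python
def upload_with_retry(upload, image, max_attempts=3):
    """Call upload(image) up to max_attempts times; return True on the first success."""
    for _ in range(max_attempts):
        try:
            upload(image)
            return True
        except OSError:
            continue                  # transmission problem, try again
    return False                      # all attempts failed: image upload is abnormal


calls = {"n": 0}


def flaky_upload(image):
    # Hypothetical NSP client that fails twice and then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("NSP unreachable")


def broken_upload(image):
    # Hypothetical NSP client that never succeeds.
    raise ConnectionError("NSP unreachable")


ok = upload_with_retry(flaky_upload, b"image-bytes")
failed = upload_with_retry(broken_upload, b"image-bytes")
```

When `upload_with_retry` returns False, the caller would record the abnormal information for this step, for example exception code 1001.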

S10710, upload the first data to the database. If the upload fails, it is determined that the data storage is abnormal, the abnormal information is recorded, and the execution returns to S1071. If the upload is successful, it is determined that the current row data verification is passed, and the process returns to execute S1071.

After uploading the image of the product, the embodiment of the present application uploads the attribute data of the product to the database.

When the upload fails, there is a problem with the data transfer between the server and the database. For example, the network between the two may be unstable. At this time, the embodiment of the present application uploads the abnormal information corresponding to the failure to upload the attribute data to the database, and the database records the abnormal information. At the same time, another commodity is selected from the sub-file as the object of attribute data verification. Because the abnormal information is much smaller than the attribute data, the probability of uploading it to the database successfully is relatively high.

If the upload is successful, the storage operation of the attribute data of the currently verified product is completed. At this time, a commodity will be re-selected from the sub-file as the object to verify the attribute data.

As an optional embodiment of the present application, the abnormal information corresponding to the failure to upload the attribute data may be the text "system abnormality", the exception code 1001, or both.

Through the operations of S1071-S10710, the data of each line in the sub-file can be processed one by one, and the attribute data of each commodity in the sub-file is thereby verified. Correspondingly, if an abnormality (including abnormal row data, abnormal image upload, and abnormal data storage) is determined in any step of S1073-S10710, the embodiment of the present application determines that the attribute data of the currently verified commodity is abnormal, that is, the currently verified commodity fails the verification. If all of S1073-S10710 succeed, the currently verified commodity is determined to have passed the verification.

As an embodiment of the present application, after the attribute data of every commodity in the sub-file has been verified, the verification of the current sub-file can be stopped in time so that the server can continue with other tasks. To this end, the following is also included after S1071:

S10711, if all the line data in the sub-file has been read, it is determined that the verification of the sub-file is completed.

In the embodiment of the present application, the verification is performed sequentially by checking the line number, attribute data format, attribute data validity, ID, product image download, product image volume, product image format, product image upload, and attribute data storage. In this way, complete and reliable verification of commodity attribute data is realized, together with the storage of attribute data and the accurate recording of abnormal information.
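The sequential checks of S1073-S10710 and the example exception codes and texts given for each step can be summarized in a short sketch. The checks themselves are replaced by stubs; `stub`, `STEPS`, `verify_row` and the `fails` field are hypothetical names used only to show the ordering and the first-failure behaviour.

```python
# Example exception codes and texts given in the text for each abnormal situation.
EXCEPTION_TEXT = {
    2203: "data parsing abnormality",
    2204: "commodity parameter verification failed",
    2202: "the commodity already exists",
    2303: "image download failed",
    2305: "image is too large",
    2304: "unsupported picture format",
    1001: "system abnormality",
}


def stub(flag):
    # Hypothetical check: passes unless the row lists this flag as failing.
    return lambda row: flag not in row["fails"]


# (check, exception code) pairs in the order of S1073 to S10710.
STEPS = [
    (stub("parse"), 2203),       # S1073
    (stub("valid"), 2204),       # S1074
    (stub("new"), 2202),         # S1075
    (stub("download"), 2303),    # S1076
    (stub("volume"), 2305),      # S1077
    (stub("format"), 2304),      # S1078
    (stub("upload"), 1001),      # S1079
    (stub("store"), 1001),       # S10710
]


def verify_row(row, steps):
    """Run the checks in order; return None on success or the first abnormal information."""
    for check, code in steps:
        if not check(row):
            return {"code": code, "text": EXCEPTION_TEXT[code]}
    return None


passed = verify_row({"fails": set()}, STEPS)
too_big = verify_row({"fails": {"volume", "format"}}, STEPS)
```

Because verification of a row stops at the first failing step, a row with both an oversized and an unsupported picture is reported with code 2305 only.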

As an optional embodiment of the present application, when attribute data or abnormal information is stored in the database, a unique identifier of the processed commodity, such as its line number, may be sent to the database at the same time. If unexpected factors then interrupt the verification of the sub-file so that the execution of the subtask times out, another server that later executes the subtask can continue from the last verified commodity according to the line number. This improves the efficiency of verification and reduces repeated verification of commodity attribute data.
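Resuming from the last verified commodity can be sketched as follows, assuming the rows of the sub-file are held as (line number, data) pairs in file order and that the database can report the last line number it recorded. The function `rows_to_verify` is a hypothetical name.

```python
def rows_to_verify(rows, last_verified_line):
    """Return the rows that come after the last line number recorded by the database."""
    if last_verified_line is None:
        return list(rows)                       # nothing recorded yet, verify everything
    numbers = [line_number for line_number, _ in rows]
    if last_verified_line not in numbers:
        return list(rows)                       # unknown line number, start from the top
    return list(rows[numbers.index(last_verified_line) + 1:])


rows = [("r1", "a"), ("r2", "b"), ("r3", "c")]
resumed = rows_to_verify(rows, "r2")
fresh = rows_to_verify(rows, None)
```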

As an optional embodiment of the present application, when combined with the embodiment shown in FIG. 2C, a distributed lock is applied for on each subtask to prevent a single subtask from being executed multiple times. If the server holding the lock malfunctions, however, the subtask would remain occupied by that server for a long time, which reduces the execution efficiency of the subtask. To prevent this, two optional coping methods are provided in this embodiment of the present application:

1. After the server acquires the distributed lock for the subtask, it starts timing. When the timed duration reaches the duration threshold and the subtask is still not completed, the server actively informs the cache component to release the distributed lock on the subtask. Other servers can then apply for the distributed lock of the subtask again and process the subtask.

2. The cache component starts timing after allocating the distributed lock corresponding to the subtask to the server. When the timed duration reaches the duration threshold, the cache component actively releases the distributed lock on the subtask, that is, the distributed lock of the subtask is forcibly unlocked. Any server can then apply for the distributed lock of the subtask again and process the subtask.

In practical applications, technical personnel can apply either or both of the above coping methods to automatically unlock the distributed locks of subtasks that have timed out. In this way, when the execution of a single subtask is abnormal, the subtask can be released automatically and in time and can be acquired by other servers, so that subtasks are automatically taken over by other nodes. The reliability of subtask execution is thereby greatly enhanced.
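The second coping method, in which the cache component itself times the lock, can be sketched as follows. Time is passed in explicitly as a number of seconds so that the behaviour is deterministic; the class `ExpiringLocks`, its methods, and the 300-second threshold are assumptions for illustration.

```python
class ExpiringLocks:
    """Hypothetical cache component whose locks lapse after a duration threshold."""

    def __init__(self, duration_threshold):
        self.duration_threshold = duration_threshold
        self._granted = {}               # subtask ID -> (server ID, grant time)

    def try_lock(self, subtask_id, server_id, now):
        holder = self._granted.get(subtask_id)
        # The lock is still held only while the timed duration is below the threshold.
        if holder is not None and now - holder[1] < self.duration_threshold:
            return False
        self._granted[subtask_id] = (server_id, now)
        return True

    def release(self, subtask_id, server_id):
        # Normal release by the holder after the subtask is completed.
        holder = self._granted.get(subtask_id)
        if holder is not None and holder[0] == server_id:
            del self._granted[subtask_id]


locks = ExpiringLocks(duration_threshold=300)
granted_a = locks.try_lock("sub-1", "server-A", now=0)
refused_b = locks.try_lock("sub-1", "server-B", now=120)
takeover_b = locks.try_lock("sub-1", "server-B", now=300)
```

In this sketch server-B is refused while server-A's lock is younger than the threshold and takes over the subtask once the threshold is reached.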

As another optional embodiment of the present application, consider that in practical applications, when the amount of data in a subtask is large, it takes a long time to complete the subtask. When the required time is longer than the duration threshold, if the database only uses the time at which the subtask was selected for execution by the server as the last update time, a subtask that is being executed normally may be judged to have timed out.

For example, suppose it takes 6 minutes to execute subtask A normally, and the duration threshold is set to 5 minutes. Assume also that the database learns that subtask A started to be executed by the server at 12:00, so that 12:00 remains the last update time of subtask A. As a result, after 12:05, although subtask A is still being executed normally by the server, the database considers that the execution of subtask A has timed out. Such a false timeout causes the subtask to be executed repeatedly by multiple servers, which reduces the processing efficiency.

In order to prevent the database from misjudging that the execution of a subtask has timed out, in the embodiment of the present application, after receiving the attribute data or abnormal information of a commodity, the database on the one hand stores the received attribute data or abnormal information, so as to store the attribute data and record the abnormal information. On the other hand, it updates the last update time of the subtask to the time at which the attribute data or abnormal information was received. In this way, the execution time of the subtask is kept up to date.
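The effect of refreshing the last update time can be checked with the figures from the example above (a 5 minute threshold, a subtask started at 12:00). The calendar date, the check time of 12:05:30, the assumed arrival of a row result at 12:04, and the name `is_timed_out` are illustrative assumptions.

```python
from datetime import datetime, timedelta

DURATION_THRESHOLD = timedelta(minutes=5)


def is_timed_out(last_update, now):
    """The database treats a subtask as timed out once the threshold has been exceeded."""
    return now - last_update > DURATION_THRESHOLD


started = datetime(2021, 9, 7, 12, 0)             # arbitrary date, 12:00 start
checked = started + timedelta(minutes=5, seconds=30)

# Without refresh: the last update time stays at 12:00.
stale = is_timed_out(started, checked)

# With refresh: a row result received at 12:04 moves the last update time forward.
refreshed = is_timed_out(started + timedelta(minutes=4), checked)
```

With the refresh, the normally running 6 minute subtask is no longer misjudged as timed out at 12:05:30, provided that row results keep arriving at intervals shorter than the threshold.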

In addition, as an optional embodiment of the present application, for the convenience of feeding back information about the commodity data to the CP, the database may create a task detail (TaskDetail) table for each parent task. When abnormal information is received, it is recorded in the task detail table. After the verification of all sub-files of the commodity data is completed, the task detail table contains the record of all abnormal information corresponding to the commodity data. According to the task detail table, the CP can clearly know which commodities in the commodity data have abnormal attribute data, and can accordingly provide the corresponding attribute data again, which improves the efficiency of commodity management.

As an optional embodiment of processing subtasks in the present application, reference may be made to FIG. 2E for the overall process of processing subtasks. In the embodiment of the present application, multiple servers are used to perform concurrent processing on subtasks. At the same time, in order to prevent a single subtask from being repeatedly processed by multiple servers, distributed locks are also introduced. Details are as follows:

Each server obtains subtasks from the database and simultaneously applies for distributed locks for the subtasks, that is, competes to grab the lock.

The server that successfully grabs the lock becomes the execution body of S104-S107 and downloads the sub-file corresponding to the subtask from the NSP. Because subtasks are processed concurrently by multiple servers, the server acting as the execution body of S104-S107 may be the same or different for different subtasks.

After downloading the subfile, verify the subfile.

During the verification process, the product image is downloaded and stored in the NSP. At the same time, the attribute data of the commodities in the sub-file will be stored in the database during the verification process. In this way, the concurrent storage and processing of commodity pictures and attribute data is realized.

S108: After storing in the database the attribute data of all the commodities in the sub-file that have passed the verification, the server determines that the execution of the subtask is completed, and sends a status update instruction for the subtask to the database, so as to update the execution status of the subtask in the database to execution completed.

When the verification result of a single commodity is obtained (both passing and failing the verification count as obtaining a verification result), the embodiment of the present application determines that the verification of the commodity is completed. After all commodities in the subtask are verified, the server sends a status update instruction to the database to inform the database that the execution of the subtask is completed. After receiving the status update instruction, the database updates the execution status of the subtask to execution completed. The content specifically included in the status update instruction is not limited here.

Corresponding to the two verification results of passing the verification and failing to pass the verification, in this embodiment of the present application, the completion of the execution of the subtask includes at least two cases:

1. The execution of the subtask is completed, and the attribute data of all commodities in the subtask has passed the verification.

2. The execution of the subtask is completed, but there are items in the subtask with abnormal attribute data, that is, the attribute data verification of the item fails.

As an optional embodiment of the present application, for a subtask that has been executed and in which the attribute data of all commodities has passed the verification, the subtask may optionally be deleted from the database to save database storage space.

As an optional embodiment of the present application, if combined with the embodiment shown in FIG. 2C , a method of applying for a distributed lock to a subtask is used to prevent a single subtask from being executed multiple times. In this embodiment of the present application, after determining that the execution of the current subtask is completed, the server releases the distributed lock on the subtask.
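The lock-grabbing flow described above can be illustrated with a minimal sketch. It is not the implementation of the embodiment: the class `InMemoryLockTable`, the function `run_server`, and the status strings are hypothetical names, an in-process table stands in for the cache component that issues distributed locks, and threads stand in for servers.

```python
import threading


class InMemoryLockTable:
    """Hypothetical stand-in for the cache component that issues distributed locks."""

    def __init__(self):
        self._guard = threading.Lock()
        self._owners = {}  # subtask id -> name of the server holding the lock

    def try_acquire(self, subtask_id, server):
        # Only one server can become the owner of a given subtask lock.
        with self._guard:
            if subtask_id in self._owners:
                return False
            self._owners[subtask_id] = server
            return True

    def release(self, subtask_id, server):
        # Only the current owner may release the lock.
        with self._guard:
            if self._owners.get(subtask_id) == server:
                del self._owners[subtask_id]


def run_server(server, subtask_ids, locks, status, processed_by):
    for subtask_id in subtask_ids:
        if not locks.try_acquire(subtask_id, server):
            continue  # another server grabbed the lock first
        try:
            if status[subtask_id] == "completed":
                continue  # already executed by another server
            # Downloading, verifying and storing the sub-file would happen here.
            processed_by.setdefault(subtask_id, []).append(server)
            status[subtask_id] = "completed"
        finally:
            # Release the lock once execution of the subtask is finished.
            locks.release(subtask_id, server)


subtasks = [f"subtask-{i}" for i in range(20)]
status = {s: "pending" for s in subtasks}
processed_by = {}
locks = InMemoryLockTable()
servers = [
    threading.Thread(
        target=run_server,
        args=(f"server-{n}", subtasks, locks, status, processed_by),
    )
    for n in range(3)
]
for t in servers:
    t.start()
for t in servers:
    t.join()
```

In this sketch every subtask ends up executed by exactly one of the three simulated servers, which is the property the distributed lock is introduced to guarantee.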

S109, the server continues to send a task query request to the database after completing the subtask.

After the server completes the current subtask, it will continue to process the next subtask. Therefore, it will return to execute S104 at this time, and send the task query request to the database again.

S110: After receiving the task query request, the database identifies the execution status of each subtask in the parent task. If all subtasks in the parent task are completed, it is determined that the storage of commodity data is completed, the storage result is generated, and the storage result is sent to the server.

S111, the server feeds back the storage result to the CP terminal.

If there are subtasks whose execution is not completed, for example subtasks that are unexecuted or still being executed, the step of S105 is executed at this time.

In the embodiment of the present application, after receiving the task query request, the database identifies the execution status of each subtask under the parent task. Unlike the moment when the parent task and the subtasks were created in the database, at least one server has by now executed subtasks under the parent task. So for the parent task, there are two possible cases:

1. All subtasks under the parent task are executed and completed.

2. There are still unexecuted subtasks (including unexecuted and executing) under the parent task.

For subtasks under the parent task that have not yet been executed, the operation of S105 needs to be performed at this time to execute the subtasks to be executed, followed by the operations corresponding to S105-S109.

In the case where all subtasks under the parent task are executed and completed, it means that all the commodity data uploaded by the CP have been processed. Among them, for the attribute data of the commodity that has passed the verification, the completion of the processing refers to the completion of the storage of the attribute data. For the attribute data of the products that have not passed the verification, the completion of the processing means that the corresponding abnormal information is recorded in the database. Therefore, at this time, the embodiment of the present application will determine that the current storage of commodity data is completed.

After it is determined that the warehousing is over, the CP needs to be informed of the warehousing situation. Therefore, after the warehousing is completed, the database will determine the actual execution of each subtask under the parent task, and generate the corresponding warehousing result. Among them, the storage situation may include the following:

1. The attribute data of all products in the product data have been successfully stored. At this point, the corresponding warehousing result can be set as the commodity data warehousing success.

2. In the product data, there is an abnormality in the attribute data of the product. The database records the corresponding exception information. At this time, the corresponding warehousing result can be set as the successful warehousing of some commodity data, and the recorded abnormal information can be regarded as part of the warehousing result. When the task detail table is used to record abnormal information, the task detail table will be used as part of the storage result.

After getting the warehousing result, the database sends the warehousing result to the server. The server will feed back the received storage result to the CP terminal. Finally, the CP terminal will display the storage results to the CP for viewing.
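The way a storage result may be derived from the subtask states can be sketched as follows. This is only an illustration under stated assumptions: the function name `build_storage_result`, the dictionary keys, and the result labels are hypothetical, since the embodiment does not fix a data format.

```python
def build_storage_result(subtasks):
    """Derive a storage result from the subtasks of one parent task.

    Each subtask is a dict with a 'status' string and an 'errors' list that
    holds the abnormal information recorded for commodities failing verification.
    """
    if any(t["status"] != "completed" for t in subtasks):
        return None  # storage is not finished yet, no result is generated

    abnormal_info = [e for t in subtasks for e in t["errors"]]
    if not abnormal_info:
        # Case 1: the attribute data of all commodities was stored successfully.
        return {"result": "success", "abnormal_info": []}
    # Case 2: some commodities were abnormal; the recorded information is
    # returned as part of the storage result.
    return {"result": "partial_success", "abnormal_info": abnormal_info}


all_ok = build_storage_result([
    {"status": "completed", "errors": []},
    {"status": "completed", "errors": []},
])
partial = build_storage_result([
    {"status": "completed", "errors": []},
    {"status": "completed", "errors": [{"commodity": "A", "reason": "missing price"}]},
])
unfinished = build_storage_result([
    {"status": "completed", "errors": []},
    {"status": "executing", "errors": []},
])
```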

Wherein, the operation of S110 may be completed by the database itself, or may be completed by the terminal device where the database is located. For details, refer to the relevant description in S105.

In the case where the attribute data of all commodities in the commodity data is successfully stored, the CP has effectively uploaded the commodity data. In the case where the attribute data of some commodities is abnormal, the CP can view the abnormal information in the storage result (if there is a task detail table, the task detail table can be viewed directly). The abnormal commodities can be determined according to the abnormal information, and the attribute data of these abnormal commodities can be reorganized or checked. The attribute data is then re-uploaded to the NSP as new commodity data, so as to retry the storage of the commodity data of the abnormal commodities.

In this embodiment of the present application, the CP only needs to provide commodity data according to certain format requirements, and the commodity data may be structured or unstructured data. When unstructured commodity data is selected, CP may not perform structured processing on commodity data. After receiving the commodity data, the commodity management system splits the commodity data to obtain multiple sub-files, and creates corresponding sub-tasks for each sub-file. Then, one or more servers are used to perform data verification on each sub-task, and synchronously store the commodity data in the sub-files to the database, and store the corresponding commodity pictures to the network storage platform. This makes warehousing more efficient and realizes efficient management of commodity data. At the same time, the step-by-step warehousing operation of attribute data is the operation of structured warehousing of commodity data. Therefore, regardless of whether the commodity data is structured or unstructured data, the embodiments of the present application can implement the structured storage of commodity data.

In the data management process of the embodiment of the present application, the CP only needs to provide the commodity data according to the format requirements in order to import the commodity data into the database offline (abbreviated as offline import; offline means that the user does not need to operate online after uploading). The CP need not perform data structuring operations. In practical applications the CP already needs to sort out commodity data (whether for inventory sorting or for listing on e-commerce platforms), so the CP only needs to organize that data according to the format requirements, without much extra work. Compared with the prior art, the embodiment of the present application greatly reduces the technical threshold of CP operation, and has higher usability. At the same time, the automatic verification and data storage of commodity data also greatly improves the management efficiency of commodity data.

In addition, when the embodiment is applied in conjunction with distributed locks, the characteristics of distributed locks ensure that, when multiple servers process subtasks concurrently, there is no need to worry about multiple servers processing the same subtask at the same time and thereby lowering processing efficiency. Therefore, the embodiments of the present application can implement highly concurrent and efficient processing of subtasks. When a distributed lock has been held for too long, both the server and the cache component will automatically unlock the subtask. This enables the subtask to be locked and processed again by other servers, and realizes node management and automatic hosting of subtasks. It prevents a subtask from remaining unexecuted for a long time because the server holding it cannot process it normally, for example due to a server failure. This makes the processing of subtasks more reliable.
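The automatic unlocking of a lock that has been held for too long can be illustrated with a small sketch. It models only expiry on the side of the cache component; `ExpiringLockTable`, its `ttl_seconds` parameter, and the explicit `now` argument (used so that the example is deterministic) are assumptions made for illustration and are not specified by the embodiment.

```python
class ExpiringLockTable:
    """Hypothetical lock table in which a lock expires after ttl_seconds."""

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        self._locks = {}  # subtask id -> (owner, time at which the lock was taken)

    def try_acquire(self, subtask_id, server, now):
        held = self._locks.get(subtask_id)
        if held is not None and now - held[1] < self.ttl_seconds:
            return False  # lock is still valid and owned by another server
        # Either the subtask was never locked or the old lock has expired,
        # so the subtask can be taken over by the requesting server.
        self._locks[subtask_id] = (server, now)
        return True

    def owner(self, subtask_id):
        held = self._locks.get(subtask_id)
        return held[0] if held is not None else None


locks = ExpiringLockTable(ttl_seconds=30)
first = locks.try_acquire("subtask-1", "server-A", now=0)    # server-A takes the lock
second = locks.try_acquire("subtask-1", "server-B", now=10)  # still locked by server-A
third = locks.try_acquire("subtask-1", "server-B", now=31)   # server-A presumed failed
```

If server-A fails while holding the lock, server-B can take over the subtask once the lock has expired, instead of the subtask being blocked indefinitely.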

Finally, by utilizing technologies such as commodity data splitting, support for multi-server concurrent processing, and automatic hosting of subtask exceptions, the embodiments of the present application can effectively process large quantities of commodity data. Therefore, the embodiments of the present application can support effective processing of large task scenarios. Through the analysis and feedback of the abnormal information of commodity attribute data, the detailed display of task failure can be realized, which is beneficial to CP to supplement abnormal attribute data in a targeted manner. The operating efficiency of the CP is improved.

As an optional embodiment of the present application, in S107 the server stores the attribute data and records the abnormal information synchronously while it performs attribute data verification on the sub-file. In practical applications, attribute data verification may instead be performed on the sub-file first, and the attribute data and abnormal information in the sub-file may be stored in the database only after the sub-file verification is completed. Referring to FIG. 3A, at this time S107 can be replaced with:

S201, the server performs attribute data verification on each commodity in the sub-file.

S202, if the verification of the sub-file is completed, store the attribute data of the commodity in the sub-file that has passed the verification to a database. For commodities that fail the verification, the abnormal information of the attribute data of these commodities will be recorded, and the abnormal information will be stored in the database.
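S201 and S202 can be sketched as follows. This is a minimal illustration only: the verification rule in `verify`, the function `process_subfile_batch`, and the dictionary standing in for the database are hypothetical, since the actual verification rules are described with the embodiment shown in FIG. 2A.

```python
def verify(item):
    """Hypothetical verification rule: a name must exist and the price must be positive."""
    problems = []
    if not item.get("name"):
        problems.append("missing name")
    price = item.get("price")
    if not isinstance(price, (int, float)) or price <= 0:
        problems.append("invalid price")
    return problems


def process_subfile_batch(items, database):
    passed, abnormal = [], []
    # S201: verify the attribute data of every commodity in the sub-file first.
    for item in items:
        problems = verify(item)
        if problems:
            abnormal.append({"id": item["id"], "problems": problems})
        else:
            passed.append(item)
    # S202: only after the whole sub-file is verified, store the results in one write.
    database["attributes"].extend(passed)
    database["abnormal_info"].extend(abnormal)
    database["writes"] += 1
    return len(passed), len(abnormal)


database = {"attributes": [], "abnormal_info": [], "writes": 0}
subfile = [
    {"id": 1, "name": "cup", "price": 9.5},
    {"id": 2, "name": "", "price": 3},
    {"id": 3, "name": "pen", "price": -1},
]
counts = process_subfile_batch(subfile, database)
```

The single increment of `writes` reflects that the server interacts with the database once per sub-file in this variant, rather than once per commodity.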

For the specific description of the operation principle and details of the data verification of the sub-file, reference may be made to the relevant description of the embodiment shown in FIG. 2A . It will not be repeated here.

It should be noted that, in this embodiment of the present application, the server first completes the verification of the attribute data of each commodity in the sub-file. For the commodities that pass the verification, all the attribute data is then stored in the database. For the commodities that have not passed the verification, the abnormal information of the attribute data is recorded, and all the abnormal information is sent to the database together. For the description of the abnormal information, reference may be made to the relevant description of S107, which will not be repeated here.

It should be particularly noted that the embodiment of the present application may be applied in combination with the embodiment shown in FIG. 2D . At this time, the embodiment of the present application will first perform the operations of S1071-S10711, and will store the attribute data and abnormal information in the database only after the sub-file verification is completed.

A comparative analysis is performed on the embodiment shown in FIG. 2A (hereinafter referred to as the embodiment a) and the embodiment shown in FIG. 3A (hereinafter referred to as the embodiment b). For the verification of sub-files, in Embodiment a, commodity attribute data is stored in the warehouse while verifying a single sub-file, and each time the attribute data of a single commodity is used as the object for verification and storage. In the embodiment b, the commodity attribute data is stored in the warehouse only after all the verification of the single sub-file is completed. There are several differences between the two embodiments:

1. Verification precision. In Embodiment a, the server verifies and stores the attribute data of a single commodity each time. Therefore, the granularity of each operation is at the level of a single commodity. In Embodiment b, the server stores commodity attribute data only after the verification of a single sub-file is completed. So the granularity is at the level of a single sub-file.

Because a single sub-file often contains the attribute data of many commodities, the verification precision of Embodiment a is higher than that of Embodiment b.

2. Verification time and consumption of network resources. Because a single sub-file often contains the attribute data of many commodities, in the verification process of a single sub-file in Embodiment a the server needs to perform data interaction with the database multiple times. This causes Embodiment a to consume more network resources, and places higher requirements on the quality of the network connection between the server and the database. In addition, multiple data interactions also increase the time taken by sub-file verification. Therefore, compared with Embodiment b, the verification of Embodiment a takes longer, consumes more network resources, and has higher requirements on the quality of the network connection.

3. Response to server exceptions. In practical applications, in the process of verifying a single sub-file, the server may experience abnormal situations such as power failure and downtime. At this point the server may abort the verification of the subfile.

For Embodiment a, the server can theoretically synchronize the attribute data verification operation and the warehousing operation at the level of a single commodity. Therefore, in the process of verifying the sub-file, the attribute data or abnormal information of each commodity in the sub-file is also stored in the database synchronously. On this basis, if the server becomes abnormal, the database has still recorded all the commodity attribute data in the current sub-file that was verified before the abnormality occurred. When other servers re-verify the sub-file, they can choose to start the verification from the beginning, or they can choose to continue verifying only the commodity attribute data in the sub-file that has not yet been put into storage.

For example, suppose there are attribute data of 1000 items in subfile a. It is assumed that the server will verify the attribute data of each commodity in turn, and an exception occurs when the 500th commodity is verified (at this time, the attribute data verification of the 500th commodity has not been completed), and the verification cannot be continued. At this time, the attribute data of the first 499 products are all stored in the warehouse. When the other servers verify the sub-file a, they can choose to re-check the attribute data of 1000 commodities, or they can choose to re-check the attribute data from the 500th commodity.

With regard to Embodiment b, if the server cannot continue to verify the current sub-file due to an abnormal situation, the database will be unable to obtain the attribute data in the current sub-file. Therefore, other servers need to re-check the subfile completely.

To sum up, for the abnormal situation of the server, compared with the embodiment b, the embodiment a can theoretically reduce the probability of repeated verification of the sub-files, thereby reducing the workload of verifying the sub-files, and realizing the effective response to the abnormal situation of the server.
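The recovery behaviour of Embodiment a in the 1000-commodity example above can be sketched as follows. The sketch is a simplification under stated assumptions: `verify_and_store_per_item` and its `fail_at` parameter are hypothetical, a set of commodity identifiers stands in for the database, and every commodity is assumed to pass the verification.

```python
def verify_and_store_per_item(item_ids, stored_ids, fail_at=None):
    """Store each commodity right after it is verified (Embodiment a style).

    stored_ids plays the role of the database. fail_at simulates a server
    failure while the commodity at that 1-based position is being verified.
    Returns the number of commodities verified by this call.
    """
    verified = 0
    for position, item_id in enumerate(item_ids, start=1):
        if item_id in stored_ids:
            continue  # already stored by a previous server, skip re-verification
        if fail_at is not None and position == fail_at:
            raise RuntimeError("server failure during verification")
        stored_ids.add(item_id)  # verification passed, store immediately
        verified += 1
    return verified


item_ids = list(range(1, 1001))
stored_ids = set()

try:
    verify_and_store_per_item(item_ids, stored_ids, fail_at=500)
except RuntimeError:
    pass
stored_before_recovery = len(stored_ids)

# Another server takes over and continues from the 500th commodity.
verified_by_second_server = verify_and_store_per_item(item_ids, stored_ids)
```

Here the second server verifies only the remaining 501 commodities. Under Embodiment b nothing would have been stored before the failure, so all 1000 commodities would need to be verified again.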

Based on the above-mentioned differences, technical personnel can select Embodiment a or Embodiment b to perform verification and storage of commodity data in combination with actual application requirements, which is not limited here.

Some supplementary explanations for Part 1, the management operation of commodity data by the commodity management system:

(1) The offline import progress of product data can be displayed on the CP.

On the basis of the embodiments shown in FIG. 2A to FIG. 3A , in order to help the CP know the storage situation of commodity data, in the embodiment of the present application the server will count the progress of the offline import of commodity data and feed it back to the CP terminal. Details are as follows:

The server sends a progress query request to the database.

After receiving the progress query request, the database obtains the first number of subtasks executed and completed under the parent task, and the second number of all subtasks included under the parent task.

The database generates progress data according to the first quantity and the second quantity, and sends it to the server.

The server sends progress data to the CP terminal.

The CP terminal displays the progress data.

In this embodiment of the present application, the server may send a progress query request to the database. After the database receives the request, it responds to the request by obtaining the number of subtasks executed and completed under the parent task (that is, the first number) and the total number of subtasks (that is, the second number), and generates progress data based on these two quantities. The database then sends the progress data to the server, and the server sends it to the CP terminal for display.

Wherein, the server may actively query according to certain rules, such as regular query, or periodic query. It can also respond to a query initiated by the CP. In this case, the CP needs to operate in the CP terminal. A query request is sent by the CP terminal to the server. The server then queries the database.

At the same time, the embodiments of the present application do not unduly limit the way in which offline import progress is represented. Therefore, the format and analysis method of the corresponding progress data are not limited here, and can be set by technicians according to actual needs. For example, offline import progress can be characterized as a percentage. At this time, the progress data is the percentage of the number of subtasks executed and completed under the parent task to the total number of subtasks. For example, suppose there are 20 subtasks under the parent task, and 10 subtasks are executed and completed. At this time, the progress data is (10÷20)×100%=50%. For another example, the offline import progress may also be represented in the form "number of subtasks executed/total number of subtasks". At this time, the database does not need to process the number of completed subtasks and the total number of subtasks, and the two quantity values can be fed back to the server directly as progress data. In this case, the CP terminal can display the progress in the form "number of subtasks executed/total number of subtasks". For example, suppose that there are 20 subtasks under the parent task, of which 10 subtasks are executed. At this time, the CP terminal can display the offline import progress as "10/20".
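The two representations of progress data described above can be sketched as follows. The function names are hypothetical; the embodiment leaves the format of the progress data open.

```python
def progress_percentage(completed, total):
    """Progress as the percentage of completed subtasks under the parent task."""
    if total <= 0:
        raise ValueError("a parent task must contain at least one subtask")
    return completed / total * 100


def progress_fraction(completed, total):
    """Progress as 'number of subtasks executed/total number of subtasks'."""
    return f"{completed}/{total}"


# The example from the text: 10 of 20 subtasks are executed and completed.
percentage = progress_percentage(10, 20)  # 50.0, i.e. 50%
fraction = progress_fraction(10, 20)      # "10/20"
```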

The embodiment of the present application realizes the progress feedback of offline import of commodity data, so that the CP can know the progress in time.

(2) The embodiments shown in FIG. 2A to FIG. 3A can also be used for online management of commodity data.

In this embodiment of the present application, the online management of commodity data includes addition, deletion, modification and query of commodity data. Among them, adding, deleting, and modifying refers to adding attribute data of new products, deleting attribute data of existing products, and modifying attribute data of existing products on the basis of commodity data already in storage. Query refers to querying the attribute data of existing products on the basis of commodity data already in storage.

When the number of commodities to be managed by the CP is large, the embodiments shown in FIG. 2A to FIG. 3A may be preferentially used to implement offline import and management of commodity data. At this time, the CP can wait for a period of time after uploading the commodity data to the commodity management system, and then the storage and management of the commodity data can be realized. However, when the commodity data to be managed is relatively small, for example when only a few or a dozen commodities need to be stored and managed, there are two options. On the one hand, the embodiments shown in FIG. 2A to FIG. 3A can be used to implement offline import of commodity data. At this time, the embodiments shown in FIG. 2A to FIG. 3A will process the commodity data, but due to the small number of commodities, it is very likely that only one sub-file will be generated when the commodity data is split (that is, no splitting is required, and the commodity data is handled as a single sub-file). Therefore, the processing steps at this time will be relatively simple. On the other hand, considering the small number of commodities, the operational difficulty of commodity data structuring is relatively low. Therefore, the existing technology can also be used to first perform structured processing of commodity data, and then upload it to the storage bucket.

To sum up, the commodity data storage management method actually adopted by the CP can be selected by the CP according to the needs, and there is no excessive restriction here. However, it should be understood that the embodiments shown in FIG. 2A to FIG. 3A can be applied to full-scenario requirements with a large or small number of commodities at the same time.

As an optional embodiment of the present application, the server provides a callable API to the CP terminal, and supports software development kits (Software Development Kit, SDK) for Java, PHP, C++, Python and other languages.

Referring to FIG. 3B , the commodity management system provides the CP with commodity management services (i.e. commodity management implemented by the embodiments shown in FIGS. 2A to 3A ), and provides users with online commodity search services. The CP invokes the server API through the CP terminal to trigger the commodity management service of the commodity management system to complete the online management of commodity data. When commodity data needs to be added, any one of the embodiments shown in FIG. 2A to FIG. 3A can be used. When there is little commodity data, sub-file splitting may be skipped, and the verification of each piece of attribute data in the commodity data, together with operations such as attribute data storage and abnormal information recording, is completed directly through the server API.

For the operations of deletion, modification and query, the CP can call the server API through the CP terminal to inform the server of the commodity to be operated on and the specific operation content. After the server is informed of the commodity and the operation content, it operates on the commodity in the database based on the operation content. For example, the price of commodity A in the database can be queried or modified, or all attribute data of commodity A in the database can be deleted.
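The add, delete, modify and query operations can be illustrated with a small sketch. The class `CommodityStore` and its method names are hypothetical and do not represent the actual server API or SDK, whose interface is not specified in this application; an in-memory dictionary stands in for the database.

```python
class CommodityStore:
    """Hypothetical in-memory stand-in for the database behind the server API."""

    def __init__(self):
        self._items = {}  # commodity id -> dict of attribute data

    def add(self, commodity_id, attributes):
        if commodity_id in self._items:
            raise KeyError(f"commodity {commodity_id!r} already exists")
        self._items[commodity_id] = dict(attributes)

    def query(self, commodity_id, field):
        return self._items[commodity_id][field]

    def modify(self, commodity_id, field, value):
        self._items[commodity_id][field] = value

    def delete(self, commodity_id):
        # Removes all attribute data of the commodity.
        del self._items[commodity_id]

    def exists(self, commodity_id):
        return commodity_id in self._items


store = CommodityStore()
store.add("A", {"name": "mug", "price": 12.0})
price_before = store.query("A", "price")  # query the price of commodity A
store.modify("A", "price", 10.5)          # modify the price of commodity A
price_after = store.query("A", "price")
store.delete("A")                         # delete all attribute data of commodity A
```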

As an optional embodiment of the present application, FIG. 3C is a service scenario interaction diagram of a commodity management system based on the embodiment shown in FIG. 3B . In one aspect of the embodiments of the present application, the CP can operate the CP terminal according to requirements, and use the CP terminal to call the server API to trigger the commodity management service of the commodity management system. After the commodity management service is triggered, the commodity management system will manage the commodity data based on the actual operation of the CP terminal, and inform the CP terminal of the management result, for example the result of deleting, modifying or querying commodity data. On the other hand, the user can operate the user terminal as required, and input a commodity text or commodity picture into the commodity management system through the user terminal to trigger the online search service of the commodity management system. After receiving the commodity text or commodity picture sent by the user terminal, the commodity management system will perform an online commodity search based on the received commodity text or commodity picture, and return the online search result to the user terminal for the user to view.

(3) The server in the embodiments shown in FIG. 2A to FIG. 3A is described as follows:

In the embodiment of the present application, there are many operations that require the server as the execution body, and the specific operations include at least the following four groups:

1. Split product data, upload child files to NSP, and create parent tasks and child tasks in the database. Such as S102-S103.

2. Query subtasks, download subfiles, verify and store attribute data. Such as S104, S106, S1061-S1063, S107, S1071-S10711, S108-S109 and S201-S202.

3. Send the storage result to the CP terminal. Including S111.

4. Online management of commodity data. Including supplementary explanation point (2).

In practical applications, there are two optional ways to implement the embodiments shown in FIG. 2A to FIG. 3A :

a. Only one server is used to implement the above four groups of operations.

b. Set up multiple servers to complete the above four groups of operations together.

For the mode a, all the servers in the embodiments shown in FIG. 2A to FIG. 3A are the same server at this time. It is suitable for scenarios with less CP and commodity data. At this point the cost of the server is lower.

For mode b, it can be applied to scenarios with more CP and commodity data. At this time, by using multiple servers to process together, multiple concurrent processing of commodity data can be realized, and the efficiency of commodity data management can be improved. Correspondingly, in the above four groups of operations, the execution subject of each group of operations may be any one of the multiple servers. Wherein, the embodiment of the present application does not limit the manner of determining the specific execution subject of each group of operations too much. It can be set by technicians according to actual needs.

For example, for operation 1: split commodity data, upload sub-files to the NSP, and create the parent task and subtasks in the database. In some optional embodiments, one server may be selected from the multiple servers to be responsible for performing operation 1. At this time, no matter which CP uploads commodity data, that server will perform commodity data splitting, sub-file uploading, and task creation. In some other optional embodiments, it can also be set that, every time a CP uploads commodity data, a server is randomly selected from the multiple servers to perform operation 1. For example, a distributed lock mechanism can be introduced. At this time, each server will synchronously apply to the cache component for a distributed lock on the commodity data, and operation 1 is performed only by the server that grabs the distributed lock.

For operation 2: query subtasks, download sub-files, verify, and store attribute data. Similar to operation 1, in some optional embodiments, one server may be selected from the multiple servers to be responsible for performing operation 2. At this time, no matter which CP uploads commodity data, that server will query the subtasks, download the sub-files, verify them, and store the attribute data. In other optional embodiments, it may also be configured that, when there are subtasks waiting to be executed, multiple servers synchronously query and process the subtasks. At this point, highly concurrent processing of subtasks can be achieved. For example, a distributed lock mechanism can be introduced. At this time, each server will synchronously apply to the cache component for distributed locks on the subtasks. A single subtask is executed by the server that grabs the corresponding distributed lock.

For operation 3: send the storage result to the CP terminal. Similar to operation 1, in some optional embodiments, a server may be selected from the plurality of servers set to be responsible for performing operation 3. At this time, the database will send the storage result to the selected server for sending to the CP terminal. In some other optional embodiments, it may also be configured that each time the database obtains a storage result, a server is randomly selected from a plurality of servers, and the storage result is sent to the server.

For operation 4: online management of commodity data. Similar to operation 1, in some optional embodiments, one server may be selected from the set multiple servers to be responsible for performing operation 4. At this time, the CP terminal will inform the server of the commodities to be operated each time, as well as the specific operation contents. The online management of commodity data is realized by the server. In some other optional embodiments, it may also be set to randomly send the data of the CP terminal to one server among the multiple servers each time, and the server implements the online management of the commodity data.

Part 2: The user conducts a product search.

On the basis of the commodity data management realized in Part 1, in order to help users find commodities and to realize the exposure of commodities, this embodiment of the present application provides users with a commodity search function. Users can upload commodity-related text or pictures to the commodity management system according to their needs. The commodity management system performs the commodity search and feeds back the search result based on the text or pictures uploaded by the user. Specifically, commodity search can be divided into two stages, before searching and during searching, which are detailed as follows:

(1) Before searching.

Before searching, it is first necessary to perform image feature analysis on the product image, and obtain image feature data for image search. In this embodiment of the present application, the operation of image feature analysis may occur in the following two stages:

Stage 1: During the verification process of the sub-files in the embodiment shown in FIG. 2A to FIG. 3A , the image feature analysis is performed on the commodity pictures synchronously.

Stage 2: After the embodiment shown in FIG. 2A to FIG. 3A completes the storage of the commodity data, perform image feature analysis on the commodity pictures stored in the NSP.

In practical applications, technicians can set any one of the above two stages to perform image feature analysis according to requirements.

For example, if image feature analysis is set to be performed in stage 1, then in the embodiments shown in FIG. 2A to FIG. 3A , when the server completes the verification of a single sub-file, it will perform image feature analysis on the corresponding commodity pictures of the commodities that have passed the verification, and store the obtained image feature data in the feature library. If image feature analysis is set to be performed in stage 2, then, after the verification of the sub-file is completed, the server may perform image feature analysis on the commodity pictures of the commodities that have passed the verification, and store the obtained image feature data in the feature library.

In the operation of image feature analysis, the operations on each product image are the same, and the operations on a single product image include:

S301, the server obtains the image of the product, performs image feature analysis on the image of the product, and stores the obtained image feature data in a feature library.

First of all, it should be noted that the server in this embodiment of the present application may be the server that verifies the sub-files, or may be another server. The specifics can be set by technical personnel according to requirements and are not limited here. Correspondingly, depending on the server and on the stage in which the image feature analysis occurs, the way of “acquiring” the product image may differ. For example, when S301 is implemented by the server that performs sub-file verification, acquiring may refer to the server reading the already downloaded product image (refer to S106; at this point, the server has downloaded the product image through the image download address). When the product image is not available locally, it needs to be downloaded from the NSP; in this case, acquiring refers to downloading the product image from the NSP.

In addition, the embodiments of the present application do not limit the specific image feature analysis method, which can be determined by technical personnel according to actual needs. For example, image feature extraction models based on neural networks or deep learning can be pre-trained for image feature analysis. The data type and content of the image feature data depend on the specific image feature analysis method. For example, the image feature data can be image feature points together with feature vectors describing the feature point information, or an image feature vector, such as a 1024-dimensional floating-point image feature vector. It can also be other feature data.

As an optional embodiment of the present application, it is considered that commodities under the same category share certain common characteristics. For example, the shapes of commodities in the same category are often similar. Therefore, in order to improve the effect of image feature analysis so that the obtained image feature data better characterizes the product, in this embodiment of the present application different image feature extraction models may be pre-designed for different categories of commodities. When performing image feature analysis, the corresponding model is selected according to the actual category of the product and used for the analysis (in this case, the product data needs to include the category of the product).

Take an example to illustrate. Suppose the product categories are divided into 10 categories: clothing, digital home appliances, shoes, luggage, home furnishing, toys, beauty, accessories, food and others. At this time, an image feature extraction model can be designed for each of the 10 categories, and 10 corresponding models can be obtained. When performing image feature analysis on a product, the image feature extraction model corresponding to the current product is first determined according to the category in the product data. Then, the image feature extraction model is used to analyze the image feature of the current product to obtain image feature data.
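The per-category model selection in this example can be sketched as a simple lookup. The following is only an illustrative sketch: the stub models, the names `MODELS` and `extract_features`, and the fallback to the "others" category for an unrecognized category are all assumptions made for illustration and are not specified by the embodiment.

```python
from typing import Callable, Dict, List

FeatureModel = Callable[[bytes], List[float]]

# The 10 categories named in the example above.
CATEGORIES = [
    "clothing", "digital home appliances", "shoes", "luggage",
    "home furnishing", "toys", "beauty", "accessories", "food", "others",
]


def make_stub_model(category: str) -> FeatureModel:
    """Hypothetical stand-in for a trained per-category extraction model."""
    def extract(image_bytes: bytes) -> List[float]:
        # Placeholder feature: image size and category-name length.
        return [float(len(image_bytes)), float(len(category))]
    return extract


# One model per category (assumed to be trained in advance).
MODELS: Dict[str, FeatureModel] = {c: make_stub_model(c) for c in CATEGORIES}


def extract_features(product: dict) -> List[float]:
    """Select the model by the category in the product data, then analyze.

    Falling back to "others" for an unknown category is an assumption.
    """
    category = product.get("category", "others")
    model = MODELS.get(category, MODELS["others"])
    return model(product["image"])
```

In a real deployment each entry of `MODELS` would be a trained network rather than a stub; only the dispatch by category is the point here.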

As an optional embodiment of the present application, in order to improve the accuracy of the image search, the embodiment of the present application will simultaneously introduce commodity information as an auxiliary for image feature analysis of commodity pictures. Details are as follows:

In this embodiment of the present application, an image feature analysis model based on a neural network is pre-trained, and commodity data is analyzed based on the image feature analysis model to obtain corresponding image feature data. Among them, for different categories of commodities, corresponding image feature analysis models can be set respectively.

The training process of the image feature analysis model includes:

Preset an initial model.

Feature extraction is performed on commodity information as sample data by using the initial model, and a first loss function for commodity information is calculated according to the extracted text features and corresponding classification labels. The text feature may be a word vector or other text features. The extraction method is not limited here, for example, the product information can be segmented, and the word vector can be obtained by using methods such as text embedding. The text features can be processed and classified by using a fully connected layer, etc., and then the loss function can be calculated based on the classification results and classification labels.

The initial model is used to extract the image features of the product images as sample data, and the second loss function for the product images is calculated according to the image features and the corresponding classification labels. The image features can be processed and classified by using the fully connected layer, and then the loss function can be calculated based on the classification results and classification labels.

In the trained model, each network used for feature extraction of commodity images is extracted, and an image feature analysis model composed of these extracted networks is obtained.

The specific loss function types of the first loss function, the second loss function, and the third loss function are not limited here, and can be set by technical personnel according to requirements. For example, the first loss function may be an image triplet loss function (Image Triplet Loss) or an image class loss function (Image Class Loss), or may be other loss functions. The second loss function may use a text class loss function (Text Class Loss), or may be other loss functions. The third loss function may be a Kullback-Leibler loss function. Other loss functions are also possible.

On the basis of obtaining the image feature analysis model, the embodiment of the present application will use the image feature analysis model to analyze each commodity picture, so as to obtain corresponding image feature data.

As an optional embodiment of the present application, consider that in practical applications, when the CP takes pictures of commodities, there is a high probability that some objects other than the commodity will also be photographed, so that the product image contains multiple objects. If feature analysis is performed directly on such a product image, the obtained image feature data also describes the other objects, which is not conducive to subsequent image matching. Therefore, in this embodiment of the present application, commodity detection is performed on commodity pictures before image feature analysis. Referring to Figure 4A, at this time S301 can be replaced with:

S3011 , the server acquires the image of the product, performs product detection on the image of the product, and cuts out the image of the product according to the detection result.

S3012, the server performs image feature analysis on the commodity image, and stores the obtained image feature data in a feature library.

The embodiment of the present application does not limit the method of commodity detection (in essence, object recognition), which can be set by technical personnel according to actual needs. For example, in some embodiments, object positioning methods may be used to locate all objects included in the product image, such as object positioning algorithms provided by OpenCV, the SSD algorithm, or deep learning-based object positioning models. When a commodity is photographed, it is generally placed directly in front of the camera of the photographing equipment. Therefore, after each object has been located, the object occupying the largest pixel area can be identified as the commodity, and the image inside the commodity target frame can be cropped out.
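The "largest pixel area" rule can be illustrated as follows. This sketch assumes that some detector has already produced bounding boxes; the box layout `(left, top, right, bottom)` and the function names are hypothetical, and the image is represented as a plain list of pixel rows to keep the example self-contained.

```python
from typing import List, Sequence, Tuple

# Assumed box layout: (left, top, right, bottom), right/bottom exclusive.
Box = Tuple[int, int, int, int]


def box_area(box: Box) -> int:
    """Pixel area of a bounding box (0 for degenerate boxes)."""
    left, top, right, bottom = box
    return max(0, right - left) * max(0, bottom - top)


def pick_commodity_box(boxes: Sequence[Box]) -> Box:
    """Treat the detected object with the largest pixel area as the commodity."""
    if not boxes:
        raise ValueError("no objects were detected in the product image")
    return max(boxes, key=box_area)


def crop(image: List[List[int]], box: Box) -> List[List[int]]:
    """Cut the commodity target frame out of a row-major pixel grid."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


# Usage: two detected objects in a 4x4 image; the larger one is kept.
image = [[r * 10 + c for c in range(4)] for r in range(4)]
commodity_box = pick_commodity_box([(0, 0, 1, 1), (1, 1, 4, 3)])
commodity_image = crop(image, commodity_box)
```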

As an optional embodiment of the present application, consider that in practical applications the size of the cropped commodity image cannot be predetermined. If image feature analysis is performed directly on such a commodity image, the amount of image feature data may be uncontrollable, which is not conducive to subsequent operations such as image feature matching. Therefore, in this embodiment of the present application, the commodity image may be padded with pixels along its length or width before S3012, so that its length and width are the same. The commodity image is then scaled to a preset size, such as 299×299. Finally, the scaled commodity image is used as the object of the image feature analysis in S3012. The image feature data obtained in this way has a relatively controllable amount of data.
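A minimal sketch of the pad-then-scale step is given below. The embodiment only states that the image is padded to equal length and width and scaled to a preset size; centering the original image, filling with zero-valued pixels, and using nearest-neighbour scaling are assumptions chosen to keep the example dependency-free. A grayscale image is represented as a list of pixel rows.

```python
from typing import List

Image = List[List[int]]  # grayscale pixels, row-major


def pad_to_square(image: Image, fill: int = 0) -> Image:
    """Pad the shorter dimension so that height == width (image centered)."""
    height, width = len(image), len(image[0])
    side = max(height, width)
    top = (side - height) // 2
    left = (side - width) // 2
    padded = [[fill] * side for _ in range(side)]
    for y, row in enumerate(image):
        padded[top + y][left:left + width] = row
    return padded


def resize_nearest(image: Image, size: int) -> Image:
    """Scale a square image to size x size by nearest-neighbour sampling."""
    src = len(image)
    return [
        [image[y * src // size][x * src // size] for x in range(size)]
        for y in range(size)
    ]


def normalize_commodity_image(image: Image, size: int = 299) -> Image:
    """Pad to a square, then scale to the preset size (299x299 by default)."""
    return resize_nearest(pad_to_square(image), size)
```

Padding before scaling keeps the aspect ratio of the commodity intact, which is presumably why the padding step precedes the scaling step.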

As an optional embodiment of the present application, in practical applications, the commodity detection in S3011 can also be performed manually by a CP or a technician. In this case, the CP or the technician manually selects the commodity box in the commodity picture, and the server crops out the commodity image.

As another optional embodiment of the present application, consider that in practical applications the number of commodities is often large. Especially as more CPs use the commodity management system, the number of commodities grows rapidly. Correspondingly, the amount of image feature data obtained by analyzing the commodity pictures also increases sharply, which puts considerable data storage pressure on the feature library. In order to reduce this storage pressure and thereby the cost of the feature library, after the image feature data is obtained in this embodiment of the present application, the image feature data is further compressed, and the compressed image feature data is then stored in the feature library. The embodiment of the present application does not limit the compression method of the image feature data, which can be set by technical personnel according to requirements. For example, the precision of the image feature data can be reduced to reduce the data volume.

As an optional embodiment of image feature data compression in this application, the compression method can be any one of the following methods:

1. Principal Component Analysis (PCA) dimensionality reduction method, which reduces the dimension of image feature data from 1024 to 512, 256 or 128-dimensional floating point data.

2. Convert the numerical type in the image feature data from a floating point type to an unsigned integer type (Uint8), and the dimension can remain unchanged.

3. Using the deep hash (Hash) training method, the floating-point image feature data is converted into binary image feature data.
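Method 2 above can be illustrated with a simple linear quantization. The embodiment only states that floating-point values are converted to Uint8 with the dimension unchanged; the min-max scaling used here, and the choice to keep the offset and scale alongside the codes so that values can be approximately restored, are assumptions for illustration.

```python
from typing import List, Tuple


def quantize_uint8(vector: List[float]) -> Tuple[bytes, float, float]:
    """Map each float to one byte (0..255); the dimension is unchanged.

    Returns the codes plus the (offset, scale) needed to restore values.
    """
    lo, hi = min(vector), max(vector)
    scale = (hi - lo) / 255 or 1.0  # avoid division by zero for flat vectors
    codes = bytes(min(255, round((v - lo) / scale)) for v in vector)
    return codes, lo, scale


def dequantize(codes: bytes, lo: float, scale: float) -> List[float]:
    """Approximate inverse of quantize_uint8."""
    return [lo + c * scale for c in codes]


# Usage: a 1024-dimensional float vector becomes 1024 bytes.
feature = [i / 1023 for i in range(1024)]
codes, offset, step = quantize_uint8(feature)
restored = dequantize(codes, offset, step)
```

Compared with 32-bit floats, one byte per dimension cuts storage to roughly a quarter, at the cost of a per-value error of about half a quantization step.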

As an optional embodiment of the present application, the offline process part of FIG. 4B is a schematic flowchart of a method for performing image feature analysis on commodity pictures before searching. It is described as follows:

The server obtains product images from product data and adds an API for product image search.

The server performs commodity detection on the commodity image, and intercepts the commodity image according to the detection result.

The server performs image feature analysis on the commodity image to obtain image feature data.

The server performs feature compression on the obtained image feature data, and stores the feature compressed image feature data in a feature library.

For specific operation details, reference may be made to S103, S3011-S3012 and related descriptions of the above-mentioned embodiments of image feature data compression, which will not be repeated here.

(2) Searching.

On the basis of completing the storage of the image feature data of the commodity pictures, the commodity management system in the embodiment of the present application provides the user with a commodity search function. The commodity search includes text search and image search. Correspondingly, in order to realize text search and image search, in the embodiment of the present application the commodity data uploaded by the CP needs to include the commodity picture or the download address of the commodity picture.

During the commodity search process, the user can initiate a commodity search request to the commodity management system through the user terminal, and upload the commodity image or commodity text (i.e., commodity-related description text) to be searched. After receiving the commodity image or commodity text to be searched, the commodity management system searches the stored commodity data and determines one or more matching commodities. Then, the attribute data and the product picture of each successfully matched product are returned to the user terminal and displayed by the user terminal.

As an optional embodiment of the present application, for a schematic diagram of a logical architecture of commodity search, reference may be made to FIG. 4C or FIG. 4D . Among them, the commodity management system is responsible for the real-time management of commodity information, including the addition, deletion, modification and query of commodities, as well as offline import of commodity data. Specifically, it includes searching for commodity pictures and commodity texts, and storing commodity data (that is, generating commodity information and storing commodity information).

In FIG. 4C , the user terminal can directly upload the commodity image or commodity text to be searched to the commodity management system. Commodity search and commodity list results are returned by the commodity management system. The right half of FIG. 4C means that the commodity management system can provide commodity data management support for each e-commerce partner. That is, e-commerce partners can upload product data to the product management system as a CP. Inventory management of commodity data is implemented by the commodity management system using the various embodiments shown in FIGS. 2A to 3A .

On the basis of FIG. 4C , one more commodity distribution service layer is set in FIG. 4D . The commodity distribution service layer is mainly connected to the media portal. The user terminal uploads the commodity image or commodity text to be searched to the commodity distribution service through the media portal. The commodity distribution service manages the commodity pictures or commodity texts uploaded by each user terminal in a unified manner, and sends them to the commodity management system to request commodity matching, so as to realize commodity search. It is also responsible for returning the commodity list generated by the commodity management system to the user terminal. Considering that in practical applications, when the number of users is large, the workload of commodity search may be relatively large. Therefore, by adding a docking media portal and managing commodity search requests in a unified manner, the management and response to the user's commodity search can be made more efficient and reliable. In practical applications, the commodity distribution service can be implemented by a dedicated server. It can also be implemented by randomly selecting a server from multiple servers set in the commodity management system. The details can be set by technical personnel according to actual needs, which is not limited here.

The details of text search and image search are as follows:

a. Text search. Referring to Figure 5A, the text search process includes:

S401, the user terminal sends the commodity text input by the user to the server.

The server, which is one of the execution bodies of the embodiment of the present application, is a server in the commodity management system responsible for commodity search. Specifically, the server may be a server pre-selected by a technician in the commodity management system. It may also be a server automatically selected according to certain rules, for example at random, from one or more servers included in the commodity management system. This is not limited here.

In the embodiment of the present application, the user terminal is provided with a commodity search function. When a user needs to search for a product, he or she can enable the product search function, and input the text of the product to be searched or upload a corresponding product image.

Among them, the optional provision methods of the product search function include at least the following:

1. It is integrated into the local search function of the user terminal, such as the common minus-one-screen search on mobile phones.

2. Set the product search function in the App or webpage. When using the App or webpage, the user can enable the product search function contained in it.

An example is used for illustration, and reference may be made to FIG. 5B . At this time, the user terminal is a mobile phone.

In (a) of FIG. 5B, the commodity search function is integrated into the local search function of the mobile phone in the form of an input box. The user can turn on this function when needed. For example, the product search function can be placed on the minus-one screen of the mobile phone. When the user opens the minus-one screen, the product search function is enabled, and the corresponding input box is displayed.

At this point, the user can enter the product text in the input box, or upload the product image. After acquiring the commodity text or commodity picture input by the user, the mobile phone will upload the commodity text or commodity picture to the server in the commodity management system.

(b) in FIG. 5B is the integration of the commodity search function in the web page of the mobile phone in the form of an input box. The user can visit the webpage in the mobile phone browser, and enter the product text in the webpage input box, or upload the product image. After acquiring the product text or product image input by the user, the webpage will upload the product text or product image to the server in the product management system.

In the embodiment of the present application, the text search is described by taking the user inputting the commodity text as an example. The product text refers to the description text related to the product. It can be either a paragraph or some keywords. For example, the name, characteristics or brand of the product. In practical applications, the commodity text is the text entered by the user according to the known situation of the searched commodity. Therefore, the actual content of the product text needs to be determined according to the actual application scenario. For example, in some possible scenarios, it may be keywords such as "shorts", "skirt" or "bread", or sentences such as "5G full-screen mobile phone, 50 million quad cameras".

S402, the server performs text matching on the commodity information of each commodity in the database according to the commodity text, and filters out the first commodity information of the top n commodities with the highest text matching degree, where n is a positive integer.

After obtaining the commodity text, in this embodiment of the present application, the server performs text matching on the commodity information of each commodity in the database based on the commodity text, and thereby obtains the matching degree between each commodity and the commodity text. The embodiments of the present application do not limit the text matching method, which can be set by technical personnel according to actual needs. For example, text matching methods based on semantic analysis can be used, such as neural network-based semantic matching models. Character-based text matching methods can also be used, such as the Brute Force (BF) algorithm, the Rabin-Karp (RK) algorithm and the Knuth-Morris-Pratt (KMP) algorithm.
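As one concrete instance of the character-based methods mentioned above, the following is a sketch of the standard Knuth-Morris-Pratt (KMP) substring search. It only reports whether and where a keyword occurs in a piece of commodity information; how such a hit is turned into a matching degree is left open by the embodiment. The function names are hypothetical.

```python
from typing import List


def build_failure(pattern: str) -> List[int]:
    """failure[i] = length of the longest proper prefix of pattern[:i + 1]
    that is also a suffix of it."""
    failure = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = failure[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        failure[i] = k
    return failure


def kmp_find(text: str, pattern: str) -> int:
    """Return the index of the first occurrence of pattern in text, or -1."""
    if not pattern:
        return 0
    failure = build_failure(pattern)
    k = 0  # number of pattern characters matched so far
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = failure[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            return i - k + 1
    return -1
```

Unlike the brute-force approach, KMP never re-reads characters of the text, so a search takes time linear in the combined length of the text and the pattern.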

Considering that in practical applications the number of commodities fed back for a user's search should not be too large, after calculating the text matching degree between the commodity text and each piece of commodity information, the embodiment of the present application filters out some commodities with a high text matching degree and uses the corresponding commodity information (i.e., the first commodity information) as the matching result. The specific number n of pieces of commodity information to be screened out is not limited here and can be set by technical personnel. For example, it can be set to any value from 10 to 20, or from 20 to 100.
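The top-n screening can be sketched as below. The matching degree used here (word overlap between the commodity text and the commodity information, measured as a Jaccard ratio) is purely an assumption standing in for whichever text matching method is chosen; the names `match_degree` and `top_n` are likewise hypothetical.

```python
import heapq
from typing import Dict, List, Tuple


def match_degree(commodity_text: str, commodity_info: str) -> float:
    """Assumed stand-in score: Jaccard overlap of lower-cased words."""
    query = set(commodity_text.lower().split())
    info = set(commodity_info.lower().split())
    if not query or not info:
        return 0.0
    return len(query & info) / len(query | info)


def top_n(commodity_text: str, database: Dict[str, str], n: int) -> List[Tuple[str, float]]:
    """Return the n commodities with the highest matching degree."""
    scored = (
        (commodity_id, match_degree(commodity_text, info))
        for commodity_id, info in database.items()
    )
    return heapq.nlargest(n, scored, key=lambda item: item[1])


# Usage with a tiny hypothetical database of commodity information.
database = {
    "p1": "red wine glass",
    "p2": "wine glass",
    "p3": "running shoes",
}
first_commodity_information = top_n("wine glass", database, 2)
```

Using `heapq.nlargest` avoids sorting the whole database when n is much smaller than the number of commodities.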

As an optional embodiment of the present application, it is considered that the number of commodities stored in the database may be extremely large in practical applications, in which case the workload of direct text matching is relatively large. In order to reduce the workload of text matching and improve the matching efficiency, in the embodiment of the present application, category screening of commodities is performed first. At this time, S402 can be replaced with S4021-S4022.

S4021, the server performs category identification of the commodity on the commodity text, and obtains the corresponding first category.

In this embodiment of the present application, the CP is required to provide the category attribute data of the commodity in the commodity data. Correspondingly, the category to which each commodity belongs is recorded in the commodity information stored in the database. The embodiment of the present application does not limit the specific classification rules of the categories, which can be preset by the technical personnel and notified to the CP.

After receiving the commodity text, the server will firstly identify the commodity category, that is, determine which category the commodity that the user needs to search for belongs to. The embodiment of the present application does not limit the specific category identification method, which can be set by the technical personnel. For example, a method of keyword matching can be used. That is, some common keywords under each category are set in advance by the technical staff. These keywords can be recorded in the form of a product noun list. For example, suppose the category contains "clothing". At this time, you can set some related keywords such as "clothes", "tops", "pants" and "skirts" under the "clothing" category. After the product text is obtained, keyword search is performed on the product text. The category to which the found keyword belongs is taken as the category corresponding to the commodity text (ie, the first category).
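The keyword-matching approach to category identification can be sketched as follows. The keyword table reuses the "clothing" keywords from the example above; the "food" entry, the substring-based lookup, and returning `None` when no keyword is found are assumptions added for illustration.

```python
from typing import Dict, List, Optional

# Product noun list: common keywords preset for each category.
# "clothing" follows the example in the text; "food" is a hypothetical entry.
CATEGORY_KEYWORDS: Dict[str, List[str]] = {
    "clothing": ["clothes", "tops", "pants", "skirts"],
    "food": ["bread"],
}


def identify_category(commodity_text: str) -> Optional[str]:
    """Return the category of the first keyword found in the commodity text.

    Returns None when no keyword matches (handling of that case is not
    specified in the embodiment and is an assumption here).
    """
    lowered = commodity_text.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return category
    return None
```

With the first category known, text matching only needs to run over commodities whose recorded category equals it, which is what shrinks the matching workload.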

S4022 , the server performs text matching of the commodity information for commodities under the first category in the database according to the commodity text, and filters out the first commodity information of at least one commodity with the highest text matching degree.

After determining the category corresponding to the commodity text, the server will only perform commodity information text matching on the commodities under the category in the database. For example, suppose the category corresponding to the product text is "clothing". At this time, the server will only perform text matching of commodity information on commodities under the category of "clothing" in the database. And get the text matching degree corresponding to these products.

S403, the server acquires attribute data in the first commodity information from the database, and acquires the commodity picture associated with the first commodity information from the NSP. Generate a product list according to the acquired attribute data and product pictures, and send the product list to the user terminal.

Referring to Figure 5A, S403 can be refined into S4031-S4033:

S4031, the server acquires attribute data in the first commodity information from the database.

S4032, the server acquires the commodity picture associated with the first commodity information from the NSP.

S4033, the server generates a commodity list according to the acquired attribute data and commodity pictures, and sends the commodity list to the user terminal.

After the commodity information is filtered out, the embodiment of the present application further downloads the attribute data contained in the commodity information from the database, acquires the commodity pictures corresponding to the commodity information from the NSP, and sends the acquired attribute data and commodity pictures to the user terminal.

It should be noted that the commodity information may contain many attribute data of the commodity, but in practice some attributes may not be important to the user. For example, suppose that the commodity information includes the download address of the commodity picture. Because the embodiment of the present application downloads the commodity picture from the NSP, the download address is not important to the user. For this reason, in this embodiment of the present application, the downloaded attribute data may be part or all of the attribute data contained in the commodity information. The specific attribute data included can be set by the technical personnel according to actual needs. For example, it can be set that the attribute data to be downloaded includes the name, price and link of the product. If the commodity information contains a description of the product, the description can also be one of the downloaded attribute data. The link may be any one or more of a web page link, an App link, and a quick application link, and is used to jump to the corresponding web page (including an Html5 page), App page, or quick application page that displays the product. In the embodiments of the present application, the web pages, App pages, and quick application pages to which the links point are collectively referred to as commodity display pages.

It is understandable that the embodiment of the present application does not limit the e-commerce platform to which the commodity display page pointed to by a link belongs. In theory, the CP can decide which platforms the links of a commodity point to according to its cooperation with different e-commerce platforms. Therefore, in practical applications, the links of a commodity may point to one or more commodity display pages on different e-commerce platforms. On this basis, users can click a link according to their actual needs to jump to the commodity display page of the corresponding e-commerce platform. Alternatively, different link priorities can be preset, and the user terminal can automatically jump to the commodity display page pointed to by the link with the higher priority.

For example, it is assumed that CP1 sells commodity A in both the e-commerce platform A and the e-commerce platform B, that is, there are corresponding commodity display pages in the e-commerce platform A and the e-commerce platform B. At the same time, e-commerce platform A and e-commerce platform B both have corresponding websites, apps and quick apps. At this time, the CP can set the corresponding links of the e-commerce platform A and the e-commerce platform B in the website, app and quick application respectively in the product data. That is, at least 6 links can be set in total.

After the attribute data and the commodity pictures are obtained, the embodiment of the present application will sort the attribute data and the commodity pictures in a unit of a single commodity. That is, first sort each product according to certain rules, and then sort the attribute data and product pictures according to the order of the products. After the sorting is completed, the attribute data and the product image of a single product are placed in the same row, and the attribute data and product images of different products are in different rows, so as to obtain a product list composed of the sorted attribute data and product images. Then, the product list is returned to the user terminal as the search result of the product text.
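The assembly of the commodity list can be sketched as follows. The embodiment leaves the sorting rule open ("certain rules"); ordering by descending text matching degree, and the dictionary-per-row layout, are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def build_commodity_list(
    matches: List[Tuple[str, float]],
    attributes: Dict[str, Dict[str, str]],
    pictures: Dict[str, str],
) -> List[Dict[str, str]]:
    """Build one row per commodity holding its attribute data and picture.

    matches: (commodity id, matching degree) pairs from the search step.
    Sorting by descending matching degree is an assumed rule.
    """
    ordered = sorted(matches, key=lambda match: match[1], reverse=True)
    rows = []
    for commodity_id, _degree in ordered:
        row = dict(attributes[commodity_id])      # name, price, link, ...
        row["picture"] = pictures[commodity_id]   # picture shares the row
        rows.append(row)
    return rows


# Usage with hypothetical data for two matched commodities.
commodity_list = build_commodity_list(
    matches=[("a", 0.4), ("b", 0.9)],
    attributes={
        "a": {"name": "Wine glass A", "price": "10"},
        "b": {"name": "Wine glass B", "price": "12"},
    },
    pictures={"a": "a.jpg", "b": "b.jpg"},
)
```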

S404, the user terminal displays the commodity list.

After receiving the commodity list, the user terminal displays the commodity list on the screen, so that the user can see the search results for the commodity text. The embodiment of the present application does not limit the display manner of the commodity list, which can be set by technicians according to their needs.

As an optional embodiment of the present application, a card may be generated for each commodity in the commodity list, and the attribute data and commodity picture of the commodity in the commodity list may be displayed on the same card. At this time, cards corresponding to each commodity one-to-one can be displayed on the display screen of the user terminal.

Take an example to illustrate. FIG. 5C builds on the embodiment shown in (a) of FIG. 5B. Suppose the user enters the product text "wine glass", and the search result contains 4 products, each of which has three attribute data (product name, price and link) and a corresponding product picture. In this case, the embodiment of the present application generates a card for each commodity and displays the product image and the attribute data in the card.

As an optional embodiment of the present application, if the commodity list includes links to commodities, each link is displayed in the card in the form of a control. When detecting the user's click operation on a link, the user terminal jumps to the product display page pointed to by that link.

Once the commodity list is displayed, if the commodity list contains links of commodities and the user clicks a link, the user terminal opens the product display page corresponding to that link. When the link is a web page link, this means starting the browser and opening the web page used for product display. When the link is an App link, this means starting the corresponding App and opening the App page for product display. When the link is a quick app link, this means starting the corresponding quick app and opening the quick app page for product display.

Take an example to illustrate. FIG. 5D builds on the example shown in FIG. 4C. It is assumed that the user clicks on web page link 1 in the first commodity card (refer to (a) in FIG. 5D). The user terminal then starts the browser and opens the web page for commodity display (refer to (b) in FIG. 5D). The user can then learn about the product details in the opened web page and perform operations such as purchasing.

As another optional embodiment of the present application, in order to meet the cooperation needs of the CP and different e-commerce platforms while preserving the user experience, the priority among different links may be preset by the technician or the CP. After the user terminal receives the commodity list containing the links, the links themselves are not displayed. For an example, refer to (a) in FIG. 5E, which builds on the embodiment shown in (a) of FIG. 5B. Suppose the user enters the product text "wine glass", and the search result contains 4 products, each of which has three attribute data (product name, price and link) and a corresponding product picture. In this case, the embodiment of the present application generates a card for each commodity and displays in the card the product image and the attribute data other than the link.

Once the commodity list is displayed, if the commodity list contains links of commodities and the user clicks the card corresponding to a commodity, the user terminal opens the product display page pointed to by the link with the highest priority. If that page fails to open, the user terminal tries to open the product display page pointed to by the link with the next highest priority, and so on, until a product display page is successfully opened.

An example is used to illustrate. Assume that the link priorities preset by the technician are, from high to low: e-commerce platform App link, quick app link and web page link. Referring to (a) in FIG. 5E, assume that the user clicks on the first commodity. The user terminal then checks, in the order of e-commerce platform App link, quick app link and web page link, whether the commodity has a corresponding link. If the commodity has an e-commerce platform App link, the user terminal starts the e-commerce platform App and jumps to the corresponding page (refer to (b) in FIG. 5E). If the commodity has only a web page link, the user terminal starts the browser and jumps to the corresponding page (refer to (c) in FIG. 5E). If the user terminal has no App, quick app or browser corresponding to the link, the link jump fails. In that case, this embodiment of the present application selects the link with the next highest priority and tries the jump again.
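The priority-based jump described above can be sketched as follows. This is a minimal illustration only: the link type names, the `openers` mapping and the function `open_product_page` are hypothetical and do not come from the application. A missing opener stands for an App, quick app or browser that is not installed on the user terminal.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical link type names; the order follows the example above
# (e-commerce platform App link, quick app link, web page link).
LINK_PRIORITY: List[str] = ["app", "quick_app", "web"]


def open_product_page(
    links: Dict[str, str],
    openers: Dict[str, Callable[[str], bool]],
    priority: List[str] = LINK_PRIORITY,
) -> Optional[str]:
    """Try each link in priority order; return the link type that opened, or None."""
    for link_type in priority:
        url = links.get(link_type)
        if url is None:
            continue  # the commodity has no link of this type
        opener = openers.get(link_type)
        if opener is None:
            continue  # nothing installed on the terminal can handle this link type
        if opener(url):
            return link_type  # a product display page was opened successfully
    return None  # every jump failed


# Usage: the App is not installed, so the jump falls back to the browser.
links = {"app": "shop://item/1", "web": "https://example.com/item/1"}
openers = {"web": lambda url: True}
opened = open_product_page(links, openers)
```

Passing the priority list as a parameter reflects that the order may be preset by the technician or the CP.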

In this embodiment of the present application, the user can search for a commodity by inputting the commodity text on the user terminal, view the attribute data of the one or more commodities found, and open a product display page as needed. This greatly facilitates the user's product search and improves the efficiency of product exposure.

b. Image search. Referring to Figure 6, the process of image search includes:

S501, the user terminal uploads the image of the product selected by the user to the server.

The operation of S501 is basically the same as that of S401. Therefore, for specific operation details, principles and beneficial effects, reference may be made to the relevant description in S401, which will not be repeated here.

The difference from S401 is:

1. In the embodiment of the present application, the user selects a local picture on the user terminal and uploads it to the server as the product picture. Alternatively, the user may take a photo with the user terminal and upload it to the server as the product picture.

2. The function entry of image search can in principle be embedded in any function that involves photo taking or image browsing. For example, in addition to the entry shown in FIG. 5B, the image search function can be embedded in the camera function of the user terminal. After photographing an object in daily life, the user can then directly enable the image search function to query the product corresponding to the photographed object. Similarly, the image search function can be embedded in the gallery of the user terminal. While browsing the gallery, the user can then enable the image search function as required to query the product corresponding to a certain image in the gallery.

In practical applications, technicians can set the function entry of image search in one or more functions of the user terminal according to requirements. In the embodiment of the present application, by embedding the function entry of image search in different functions, on the one hand, it is convenient for users to use image search, and "photograph shopping" can be realized anytime and anywhere. On the other hand, it can increase the exposure of products and bring more traffic to merchants and e-commerce platforms.

S502, the server performs image feature analysis on the received commodity picture to obtain first image feature data.

In the embodiment of the present application, the method for analyzing the image features of the product pictures uploaded by the user is the same as the image feature analysis method for the uploaded product pictures before the search. Therefore, for the operation of the image feature analysis, reference may be made to the relevant description in S301, which will not be repeated here.

As an optional embodiment of the present application, it is considered that commodities under the same category share certain characteristics. For example, the shapes of commodities in the same category are often similar. Therefore, in order to improve the effect of image feature analysis so that the obtained image feature data better characterizes the product, different image feature extraction models may be designed in advance for different categories of commodities. When performing image feature analysis, category identification (also referred to as intent classification) is first performed on the image of the product to determine the actual category of the product to be searched. The identified category is then used to select the corresponding model, which performs the analysis.

In the embodiment of the present application, the category identification of commodity pictures is essentially the automatic classification of commodities. Therefore, a corresponding category classification model can be set in advance for the known commodity categories, and the category classification model is then used to classify and identify the commodity category. This embodiment of the present application does not particularly limit the model type and architecture of the category classification model, which can be set by technicians according to actual needs.
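The two-stage analysis described above (identify the category, then run the model designed for that category) can be sketched as follows. The function and parameter names are hypothetical, the models are stand-in callables, and the fallback to a default extractor for an unknown category is an assumption that the application does not state.

```python
from typing import Callable, Dict, List

# A feature extractor maps raw image bytes to a feature vector.
FeatureExtractor = Callable[[bytes], List[float]]


def analyze_image(
    image: bytes,
    classify_category: Callable[[bytes], str],
    extractors: Dict[str, FeatureExtractor],
    default_extractor: FeatureExtractor,
) -> List[float]:
    """Identify the category first, then run the extractor designed for it."""
    category = classify_category(image)
    # Assumption: fall back to a generic extractor when the category is unknown.
    extractor = extractors.get(category, default_extractor)
    return extractor(image)


# Usage with toy stand-ins for the real models.
extractors = {"cup": lambda img: [1.0, 0.0]}
default = lambda img: [0.0, 0.0]
features = analyze_image(b"img", lambda img: "cup", extractors, default)
```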

As an optional embodiment of the present application, if in the pre-search stage S301 uses the image feature analysis model obtained based on multimodal fusion to perform image feature analysis (for details of the image feature analysis model, refer to the corresponding embodiment description in S301), then in S502 the same image feature analysis model as in S301 is used to perform image feature analysis on the received image of the product, thereby obtaining the corresponding image feature data (that is, the first image feature data).

It should be understood that the operation of performing commodity detection on commodity pictures may also be applicable to the embodiments of the present application. Therefore, at this time, S3011-S3012 can be applied to the embodiments of the present application. Correspondingly, at this time, S502 can be replaced with:

S5021, the server performs commodity detection on the received commodity image, and intercepts the commodity image according to the detection result.

S5022, the server performs image feature analysis on the commodity image, and obtains first image feature data.

The operations of S5021-S5022 are basically the same as those of S3011-S3012, so the specific operation details, principles and beneficial effects can be referred to the relevant descriptions in S3011-S3012, which will not be repeated here.

S503: The server performs feature matching on the image feature data in the feature library according to the first image feature data. The top n second image feature data with the highest feature matching degree are screened from the feature library, and n commodities corresponding to the top n second image feature data respectively are determined.

After obtaining the image feature data (that is, the first image feature data) of the product image uploaded by the user, the embodiment of the present application uses it to perform feature matching against the image feature data stored in the feature library, and screens out the top n items of image feature data (that is, the second image feature data) with the highest feature matching degree. The products corresponding to these items of image feature data are then used as the target products of this search.

The embodiments of the present application do not particularly limit the specific method of feature matching, which can be set by technicians according to actual needs. For example, an open source search engine can be used to implement feature matching. Faiss is one option: its principle is to calculate the similarity of image features and then return a number of products according to the similarity. The number n of items of image feature data to be screened is not limited here and can be set by technicians. For example, it can be set to any value from 10 to 20, or from 20 to 100.
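As a minimal sketch of top-n feature matching, the following uses plain cosine similarity over an in-memory dictionary. This is a swapped-in illustration rather than Faiss or the method of the application. The names `feature_library`, `top_n_matches` and the commodity identifiers are hypothetical.

```python
import heapq
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_n_matches(
    query: Sequence[float],
    feature_library: Dict[str, Sequence[float]],
    n: int,
) -> List[Tuple[str, float]]:
    """Return the n (commodity id, similarity) pairs with the highest similarity."""
    scored = (
        (commodity_id, cosine_similarity(query, feature))
        for commodity_id, feature in feature_library.items()
    )
    return heapq.nlargest(n, scored, key=lambda pair: pair[1])


# Usage: a toy feature library with two-dimensional features.
feature_library = {
    "cup-1": [1.0, 0.0],
    "cup-2": [0.8, 0.6],
    "shoe-1": [0.0, 1.0],
}
matches = top_n_matches([1.0, 0.0], feature_library, n=2)
```

An exhaustive scan like this is only practical for small libraries; a dedicated similarity search library would normally replace it at scale.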

S504: Obtain product pictures of n products from the NSP, and obtain attribute data of the n products from the database. Generate a product list according to the acquired attribute data and product pictures, and send the product list to the user terminal.

Referring to Figure 6, S504 can be refined into S5041-S5043:

S5041, the server obtains commodity pictures of n commodities from the NSP.

S5042, the server obtains attribute data of n commodities from the database.

S5043, the server generates a commodity list according to the acquired attribute data and commodity pictures, and sends the commodity list to the user terminal.

After the n commodities found in this search are determined, the embodiment of the present application downloads the commodity pictures of these commodities from the NSP and the attribute data of the n commodities from the database. A product list is then generated according to the obtained attribute data and product pictures and sent to the user terminal. The downloading of the attribute data and the generation of the commodity list are basically the same as in S406. For details, refer to the relevant description of S406, which will not be repeated here.

As an optional embodiment of the present application, in order to improve the ordering of the attribute data and commodity pictures in the commodity list, the attribute data and pictures of products that are highly similar to the searched product should be ranked as near the front as possible. Therefore, in S504, before the operation of "sending the product list to the user terminal", the server may also reorder the attribute data and product pictures of each product in the product list. For example, in practical applications product color is relatively important for user experience. In that case, product pictures in the product list whose color is similar to that of the product picture uploaded by the user terminal, together with their attribute data, can be placed toward the front of the product list.

As a possible implementation of the reordering of the present application, it includes:

S601, the server performs trademark detection on the commodity picture uploaded by the user terminal, and obtains first trademark information included in the commodity picture.

S602: The server performs trademark detection on each commodity picture in the commodity list, respectively, to obtain second trademark information contained in these commodity pictures.

S603: The server uses the first trademark information to perform information matching on each second trademark information, and sorts the attribute data and product pictures of each commodity in the commodity list according to the order of the information matching degree from high to low.

Here, the trademark information (including the first trademark information and the second trademark information) includes at least one of a brand name and a trademark pattern, as set by technicians according to actual needs. In addition, the embodiments of the present application do not particularly limit the detection method of trademark information, which can be set by technicians. For example, it can be an image recognition method based on a neural network model, or image matching against a set of preset trademark images.

In this embodiment of the present application, in addition to using the image feature data of the product image to perform image feature matching, the trademark information contained in the product image is used to perform secondary matching, and the obtained product list is reordered according to the secondary matching result. As a result, the attribute data and commodity pictures of commodities that are highly similar to the commodity the user wants to retrieve can be displayed preferentially in the user terminal.
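The sorting step of S603 can be sketched as follows, assuming trademark detection has already produced a brand name for the uploaded picture and for each listed picture. The matching degree used here (string similarity from Python's `difflib`) and the `trademark` field name are hypothetical choices for illustration; the application leaves the matching method open.

```python
from difflib import SequenceMatcher
from typing import Dict, List


def match_degree(first: str, other: str) -> float:
    """Hypothetical matching degree: similarity of two brand names in [0, 1]."""
    if not first or not other:
        return 0.0  # no trademark detected on one side
    return SequenceMatcher(None, first.lower(), other.lower()).ratio()


def reorder_by_trademark(
    first_trademark: str, product_list: List[Dict[str, str]]
) -> List[Dict[str, str]]:
    """Sort list entries by matching degree, from high to low.

    sorted() is stable, so entries with equal matching degree keep the
    order produced by the earlier image feature matching.
    """
    return sorted(
        product_list,
        key=lambda item: match_degree(first_trademark, item.get("trademark", "")),
        reverse=True,
    )


# Usage: "Acme" matches the uploaded picture's trademark and moves to the front.
product_list = [
    {"name": "Glass A", "trademark": "Other"},
    {"name": "Glass B", "trademark": "Acme"},
    {"name": "Glass C", "trademark": ""},
]
reordered = reorder_by_trademark("ACME", product_list)
```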

Another possible implementation of the reordering of the present application includes:

S604, the server performs trademark detection on the commodity picture uploaded by the user terminal, and obtains the first trademark information included in the commodity picture.

S605, the server extracts the third trademark information of n commodities according to the acquired attribute data.

S606, the server uses the first trademark information to perform information matching on each third trademark information, and sorts the attribute data and product pictures of each commodity in the commodity list according to the order of the information matching degree from high to low.

Here, the trademark information (including the first trademark information and the third trademark information) is the brand name. On the one hand, in the embodiment of the present application, trademark detection is performed on the commodity picture uploaded by the user terminal, and the brand name of the trademark contained in it (that is, the first trademark information) is identified. On the other hand, the brand name of each commodity (that is, the third trademark information) is looked up in the attribute data of the n commodities. Secondary matching is performed based on the brand name, and the obtained product list is reordered according to the secondary matching result. As a result, the attribute data and commodity pictures of commodities that are highly similar to the commodity the user wants to retrieve can be displayed preferentially in the user terminal.

As another possible implementation of the reordering of the present application, it includes:

S607, the server performs trademark detection on the commodity picture uploaded by the user terminal, and obtains the first trademark information included in the commodity picture.

S608: The server performs trademark detection on each commodity picture in the commodity list, respectively, to obtain second trademark information contained in these commodity pictures.

S609, the server extracts the third trademark information of n commodities according to the acquired attribute data.

S6010, the server uses the first trademark information to perform information matching on the second trademark information and the third trademark information of the n commodities, and sorts the attribute data and product pictures of each commodity in the commodity list according to the order of the information matching degree from high to low.

In the embodiment of the present application, the first trademark information includes the brand name and may additionally include the trademark pattern. If the first trademark information contains only the brand name, the second trademark information is the brand name. If the first trademark information contains both the brand name and the trademark pattern, the second trademark information may contain any one or more of the brand name and the trademark pattern. The third trademark information is the brand name.

On the one hand, in the embodiment of the present application, trademark detection is performed on the product image uploaded by the user terminal, and the first trademark information contained in it is identified. On the other hand, the brand name of each product (that is, the third trademark information) is looked up in the attribute data of the n products, and the second trademark information is identified from the product pictures of the n products. Secondary matching is performed according to the three types of trademark information obtained, and the obtained commodity list is reordered according to the secondary matching result. As a result, the attribute data and commodity pictures of commodities that are highly similar to the commodity the user wants to retrieve can be displayed preferentially in the user terminal.

The embodiments of the present application do not particularly limit the matching method of trademark information, which can be set by technicians according to actual needs. For example, in some optional embodiments, the first trademark information is matched against each item of second trademark information to obtain a first matching degree for each of the n commodities, and against each item of third trademark information to obtain a second matching degree for each of the n commodities. The final matching degree of each commodity is then determined from its first matching degree and second matching degree (for example, by a weighted sum) and used as the matching result.
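The weighted-sum option mentioned above can be sketched as follows. The weight value, the function names and the commodity identifiers are hypothetical; the application only says that weighted summation is one possible way to combine the two matching degrees.

```python
from typing import Dict, List, Tuple


def final_matching_degree(
    first_degree: float, second_degree: float, picture_weight: float = 0.5
) -> float:
    """Weighted sum of the picture-based and attribute-based matching degrees."""
    if not 0.0 <= picture_weight <= 1.0:
        raise ValueError("picture_weight must be between 0 and 1")
    return picture_weight * first_degree + (1.0 - picture_weight) * second_degree


def rank_by_final_degree(
    degrees: Dict[str, Tuple[float, float]], picture_weight: float = 0.5
) -> List[str]:
    """Return commodity ids ordered by final matching degree, high to low."""
    return sorted(
        degrees,
        key=lambda cid: final_matching_degree(*degrees[cid], picture_weight),
        reverse=True,
    )


# Usage: each value is (first matching degree, second matching degree).
degrees = {"a": (0.9, 0.2), "b": (0.6, 0.8), "c": (0.1, 0.1)}
ranking = rank_by_final_degree(degrees)
```

With equal weights, commodity "b" ranks first because both of its matching degrees are reasonably high; shifting the weight toward the picture-based degree would favor "a" instead.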

As an optional embodiment of the present application, consider that in practical applications a CP may by mistake place the attribute data of the same commodity in the same commodity data more than once. For example, several tops that differ only in size are placed in the same commodity data. A single commodity may then correspond to multiple items of attribute data at the same time. If that commodity has a high priority, its attribute data may appear repeatedly in the commodity list, which degrades the user experience.

To cope with this situation, before sending the commodity list to the user terminal, the embodiment of the present application performs commodity deduplication on the commodity list. That is, for each commodity that appears more than once in the commodity list, only one set of attribute data and one product picture are retained, and the duplicate attribute data and product pictures are deleted. This updates the commodity list by deduplication, improves the effectiveness of the commodity list, and improves the user experience. Correspondingly, what is displayed in S505 is the commodity list after deduplication and updating.
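A minimal sketch of commodity deduplication is given below. How two entries are recognized as the same commodity is not specified in the application; the `product_id` field used as the default key here is a hypothetical assumption, and keeping the first (highest ranked) occurrence is likewise an illustrative choice.

```python
from typing import Any, Callable, Dict, Hashable, List


def deduplicate(
    product_list: List[Dict[str, Any]],
    key: Callable[[Dict[str, Any]], Hashable] = lambda item: item["product_id"],
) -> List[Dict[str, Any]]:
    """Keep only the first entry for each commodity, preserving list order."""
    seen = set()
    result = []
    for item in product_list:
        identity = key(item)
        if identity in seen:
            continue  # duplicate entry of a commodity already kept
        seen.add(identity)
        result.append(item)
    return result


# Usage: the same top listed twice with different sizes collapses to one entry.
product_list = [
    {"product_id": "top-1", "size": "M"},
    {"product_id": "cup-7", "size": ""},
    {"product_id": "top-1", "size": "L"},
]
deduplicated = deduplicate(product_list)
```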

S505, the user terminal displays the commodity list.

The operation of S505 is basically the same as that of S404. Therefore, for specific operation details, principles and beneficial effects, reference may be made to the relevant description in S404, which will not be repeated here.

As an optional embodiment of the present application, for the display of the image search results (that is, the product list) and the way of responding to the user clicking on a link, reference may be made to FIGS. 5C to 5E, except that the input data is changed from the commodity text "red wine glass" to a picture of a red wine glass.

As an optional embodiment of the present application, the online process part of FIG. 4B is a schematic flowchart of a method for performing image search on a product image uploaded by a user. The process is described as follows:

The user terminal uploads the product image to the server through the API.

The server performs category recognition on the commodity image to obtain the second category.

Based on the second category, the server performs image feature analysis on the commodity image to obtain first image feature data.

The server performs data compression on the first image feature data to obtain compressed first image feature data.

The server performs feature matching on the image feature data in the feature library based on the first image feature data to obtain a product list.

The server reorders the commodity list to obtain the sorted commodity list.

Commodity deduplication is performed on the sorted commodity list, and the commodity list after commodity deduplication operation is sent to the user terminal.

For the operation details, principles and beneficial effects of each step in this embodiment of the present application, reference may be made to the relevant descriptions in the embodiment shown in FIG. 6, which will not be repeated here.

In the embodiment of the present application, the commodity management system has both text search and image search functions. The CP stores commodity data once and can thereby use multiple commodity distribution channels. This provides the e-commerce platform with more ways to search for products and has high practical value.

It should be noted that the information exchange, execution process and other contents between the above-mentioned devices/units are based on the same concept as the method embodiments of the present application. For specific functions and technical effects, please refer to the method embodiments section. It is not repeated here.

It should be understood that the size of the sequence numbers of the steps in the above embodiments does not mean the sequence of execution, and the execution sequence of each process should be determined by its function and internal logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.

It is to be understood that, when used in this specification and the appended claims, the term "comprising" indicates the presence of the described feature, integer, step, operation, element and/or component, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or sets thereof.

It will also be understood that, as used in this specification and the appended claims, the term "and/or" refers to and includes any and all possible combinations of one or more of the associated listed items.

As used in the specification of this application and the appended claims, the term "if" may be contextually interpreted as "when" or "once" or "in response to determining" or "in response to detecting". Similarly, the phrases "if it is determined" or "if the [described condition or event] is detected" may be interpreted, depending on the context, to mean "once it is determined" or "in response to the determination" or "once the [described condition or event] is detected" or "in response to detection of the [described condition or event]".

In addition, in the description of the specification of the present application and the appended claims, the terms "first", "second", "third", etc. are only used to distinguish the description, and should not be construed as indicating or implying relative importance. It will also be understood that, although the terms "first," "second," etc. are used in the text to describe various elements in some embodiments of the present application, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first table could be named a second table, and similarly, a second table could be named a first table, without departing from the scope of the various described embodiments. The first table and the second table are both tables, but they are not the same table.

References in this specification to "one embodiment" or "some embodiments" and the like mean that a particular feature, structure or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," etc. in various places in this specification do not necessarily all refer to the same embodiment, but mean "one or more but not all embodiments" unless specifically emphasized otherwise. The terms "including", "comprising", "having" and their variants mean "including but not limited to" unless specifically emphasized otherwise.

In addition, each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit. The above-mentioned integrated units may be implemented in the form of hardware, or may be implemented in the form of software functional units.

An embodiment of the present application further provides a server. The server includes at least one memory, at least one processor, and a computer program stored in the at least one memory and executable on the at least one processor. When the processor executes the computer program, the server is caused to implement the steps in any of the foregoing method embodiments.

Embodiments of the present application further provide a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and when the computer program is executed by a processor, the steps in the foregoing method embodiments can be implemented.

The embodiments of the present application provide a computer program product. When the computer program product runs on a server, the server is caused to implement the steps in each of the above method embodiments.

An embodiment of the present application further provides a chip system. The chip system includes a processor, the processor is coupled to a memory, and the processor executes a computer program stored in the memory, so as to implement the steps in the foregoing method embodiments.

The integrated modules/units, if implemented in the form of software functional units and sold or used as independent products, may be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the above embodiments of the present application can be completed by instructing the relevant hardware through a computer program. The computer program can be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the foregoing method embodiments can be implemented. The computer program includes computer program code, and the computer program code may be in the form of source code, object code, an executable file or some intermediate form. The computer-readable storage medium may include: any entity or device capable of carrying the computer program code, a recording medium, a U disk, a removable hard disk, a magnetic disk, an optical disk, computer memory, read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), electrical carrier signals, telecommunication signals, software distribution media, and the like.

In the foregoing embodiments, the description of each embodiment has its own emphasis. For parts that are not described or described in detail in a certain embodiment, reference may be made to the relevant descriptions of other embodiments.

Those of ordinary skill in the art can realize that the units and algorithm steps of each example described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may implement the described functionality using different methods for each particular application, but such implementations should not be considered beyond the scope of this application.

The units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.

The above-mentioned embodiments are only used to illustrate the technical solutions of the present application, but not to limit them. Although the present application has been described in detail with reference to the above-mentioned embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the above-mentioned embodiments can still be modified, or some technical features thereof can be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present application, and should be included within the scope of protection of this application.

Finally, it should be noted that the above are only specific embodiments of the present application, but the protection scope of the present application is not limited to them, and any changes or replacements within the technical scope disclosed in the present application should be included within the scope of protection of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

A commodity data management method, characterized by being applied to a server, the method comprising:

Obtain commodity data, and split the commodity data into at least one first sub-file, wherein each of the first sub-files contains attribute data of at least one commodity;

Perform attribute data verification on each of the first sub-files, and store the verified attribute data in a database.
The commodity data management method according to claim 1, wherein the performing attribute data verification on each of the first sub-files, and storing the verified attribute data in a database, comprises:

A sub-file is selected from the at least one first sub-file as the second sub-file;

Perform attribute data verification on the second sub-file, and upload the verified attribute data in the second sub-file to the database;

After completing the verification of the second sub-file, return to executing the operation of selecting a sub-file from the at least one first sub-file as the second sub-file, until the verification of all the first sub-files is completed.
The commodity data management method according to any one of claims 1 or 2, characterized in that, further comprising:

If there is attribute data that fails to be verified in the commodity data, the abnormal information of the attribute data that has failed to be verified is acquired, and the abnormal information is stored in the database.
The commodity data management method according to any one of claims 1 to 3, wherein the commodity data is data in a data table format.
The commodity data management method according to any one of claims 2 to 4, wherein the uploading the attribute data that has passed the verification in the second sub-file to the database comprises:

In the process of performing attribute data verification on the second sub-file, uploading the verified attribute data in the second sub-file to the database; or

After the attribute data verification of the second sub-file is completed, the attribute data that has passed the verification in the second sub-file is uploaded to the database.
The commodity data management method according to any one of claims 1 to 5, wherein the attribute data in the commodity data includes a commodity image download address, and the method further comprises:

Download the product image according to the product image download address included in the attribute data that has passed the verification;

Perform image feature analysis on the commodity picture to obtain image feature data;

The image feature data is stored in a feature library.
A commodity search method, characterized in that it is applied to a server, the method comprising:

receiving the first product picture uploaded by the user terminal;

Perform image feature analysis on the first commodity picture to obtain first image feature data;

From the image feature data stored in the feature library, determine at least one second image feature data with the highest feature matching degree with the first image feature data;

Send, to the user terminal, a second product image corresponding one-to-one to the at least one second image feature data, together with attribute data associated with the second product image, wherein the second product image and the associated attribute data that are sent have been sorted based on the trademark information contained in the first product image.
The product search method according to claim 7, characterized in that, before the second product image corresponding one-to-one to the at least one second image feature data and the attribute data associated with the second product image are sent to the user terminal, the method further includes:
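The feature matching step of this search method can be sketched as follows. Cosine similarity is used here as an assumed measure of the "feature matching degree"; the claims do not name a metric. The function names and the dictionary-based feature library are also hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Assumed matching degree between two feature vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_matches(query, feature_library, k=3):
    """Return the ids of the k library entries that best match the query vector."""
    scored = [
        (cosine_similarity(query, vector), product_id)
        for product_id, vector in feature_library.items()
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [product_id for _, product_id in scored[:k]]
```

A production feature library would more likely use an approximate nearest-neighbour index than a linear scan; the scan is kept here only to make the selection rule explicit.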

obtaining the first trademark information contained in the first product image;

Obtain the target trademark information of each target product, where the target product is the product associated with the second image feature data, and the second product image and the associated attribute data are the product image and attribute data of the target product;

The second product image and the attribute data of the target product are sorted in descending order of the information matching degree between the target trademark information and the first trademark information.
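The sorting step can be sketched as below. The string similarity from Python's standard `difflib.SequenceMatcher` stands in for the "information matching degree", which the claims leave unspecified; the `trademark` key and both function names are hypothetical.

```python
from difflib import SequenceMatcher


def trademark_match(first, target):
    """Assumed matching degree: case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, first.lower(), target.lower()).ratio()


def sort_by_trademark(first_trademark, targets):
    """Order target products by descending trademark matching degree."""
    return sorted(
        targets,
        key=lambda target: trademark_match(first_trademark, target["trademark"]),
        reverse=True,
    )
```

Because `sorted` is stable, target products with equal matching degree keep the order produced by the earlier feature matching step.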
The commodity search method according to claim 8, wherein the target trademark information includes: second trademark information and/or third trademark information;

The second trademark information is the trademark information contained in the second product image associated with the target product;

The third trademark information is the trademark information included in the attribute data associated with the target product.
The commodity search method according to any one of claims 7 to 9, wherein the performing image feature analysis on the first commodity picture to obtain the first image feature data, comprising:

Perform image feature analysis on the first product image by using a pre-trained image feature analysis model to obtain the first image feature data; the image feature analysis model is a model extracted from a neural network model that is obtained by training on product image samples and attribute data samples of multiple product samples.
A commodity data management system, comprising: a first server, a second server and a database;

The first server is configured to obtain commodity data, and split the commodity data into at least one first sub-file, wherein each of the first sub-files contains attribute data of at least one commodity;

The second server is configured to perform attribute data verification on each of the first sub-files, and store the verified attribute data in a database.
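The splitting performed by the first server can be sketched as fixed-size chunking of the rows of a data table. The function name and the `rows_per_subfile` parameter are assumptions; the claims require only that each first sub-file contain the attribute data of at least one commodity.

```python
def split_rows(rows, rows_per_subfile):
    """Split a list of commodity rows into consecutive sub-files (lists of rows)."""
    if rows_per_subfile < 1:
        raise ValueError("rows_per_subfile must be at least 1")
    return [
        rows[start:start + rows_per_subfile]
        for start in range(0, len(rows), rows_per_subfile)
    ]
```

Every chunk produced this way is non-empty, so each sub-file holds at least one commodity whenever the input itself is non-empty.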
The commodity data management system according to claim 11, wherein the performing attribute data verification on each of the first sub-files, and storing the verified attribute data in a database, specifically includes:

The second server selects a sub-file from the at least one first sub-file as the second sub-file;

The second server performs attribute data verification on the second sub-file, and uploads the verified attribute data in the second sub-file to the database;

After completing the verification of the second sub-file, the second server returns to the operation of acquiring one sub-file from the at least one first sub-file, until all the first sub-files have been verified.
The commodity data management system according to claim 12, characterized in that, before selecting a sub-file from the at least one first sub-file as the second sub-file, further comprising:

The first server creates, in the database, a first subtask corresponding to the first subfile one-to-one;

The second server selects one sub-file from the at least one first sub-file as the second sub-file, including:

The second server determines a second subtask from the first subtask stored in the database, and acquires a subfile associated with the second subtask in the at least one first subfile;

The second server returns to perform the operation of acquiring one sub-file in the at least one first sub-file until all the first sub-files are verified, including:

The second server returns to perform the operation of determining a second subtask from the first subtasks stored in the database until all the first subtasks are executed.
The commodity data management system according to claim 13, wherein the operation of determining a second subtask from the first subtask stored in the database comprises:

The second server sends a task query request to the database;

In response to the received task query request, the database filters out subtasks to be executed from the first subtasks and sends the subtasks to be executed to the second server, wherein the subtasks to be executed include any unexecuted first subtask and any first subtask that is being executed and whose execution duration exceeds the duration threshold;

The second server determines the second subtask from the received subtasks to be executed.
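The filtering rule applied by the database can be sketched as follows. The `Subtask` record, its status labels and the explicit `now` argument are hypothetical; in practice this would probably be a database query rather than application code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Subtask:
    task_id: int
    status: str                         # assumed labels: "pending", "running", "done"
    started_at: Optional[float] = None  # seconds; set when execution starts


def subtasks_to_execute(subtasks, now, duration_threshold):
    """Select unexecuted subtasks plus running subtasks that exceeded the threshold."""
    selected = []
    for task in subtasks:
        if task.status == "pending":
            selected.append(task)
        elif (
            task.status == "running"
            and task.started_at is not None
            and now - task.started_at > duration_threshold
        ):
            # Treated as stalled, so another server may take it over.
            selected.append(task)
    return selected
```

Including overdue running subtasks is what lets a sub-file be picked up again if the server that was verifying it stalls or fails.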
The commodity data management system according to claim 14, wherein the operation of the second server determining the second subtask from the received subtasks to be executed comprises:

The second server sequentially requests the cache component for distributed locks for each of the subtasks to be executed;

When the second server obtains the distributed lock for a subtask to be executed, that subtask to be executed is taken as the second subtask.
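The sequential lock requests can be sketched as below. `InMemoryLockTable` is a single-process stand-in for the cache component and is not a real distributed lock; all names are hypothetical.

```python
class InMemoryLockTable:
    """Single-process stand-in for the cache component (assumption)."""

    def __init__(self):
        self._owners = {}  # task id -> id of the server holding the lock

    def try_acquire(self, key, owner):
        """Grant the lock only if nobody holds it yet."""
        if key in self._owners:
            return False
        self._owners[key] = owner
        return True


def pick_second_subtask(candidate_ids, locks, server_id):
    """Request a lock for each candidate in turn; return the first one granted."""
    for task_id in candidate_ids:
        if locks.try_acquire(task_id, server_id):
            return task_id
    return None  # every candidate is already locked by another server
```

With several second servers polling the same candidate list, each server ends up with a different subtask, because a lock that has been granted is refused to every later requester.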
The commodity data management system according to claim 15, wherein in the process of performing attribute data verification on the second sub-file, the second server is further configured to:

judging whether the verification duration of the second sub-file reaches the duration threshold;

If the verification duration of the second subfile reaches the duration threshold, the distributed lock on the second subfile is released.
The commodity data management system according to claim 16, further comprising: the cache component;

The cache component is configured to start timing after allocating the distributed lock on the second sub-file to the second server;

The cache component is further configured to release the distributed lock on the second subfile when the timing duration reaches a duration threshold.
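The timed release by the cache component can be sketched as a lock table whose entries expire. The class, its method names and the explicit `now` argument are assumptions made so that the behaviour is deterministic; a real cache component would typically rely on a built-in key expiry mechanism.

```python
class ExpiringLockTable:
    """Stand-in for a cache component that releases locks after a time limit."""

    def __init__(self, duration_threshold):
        self.duration_threshold = duration_threshold
        self._locks = {}  # key -> (owner, time at which the lock was allocated)

    def try_acquire(self, key, owner, now):
        held = self._locks.get(key)
        # A lock still counts as held only while its age is below the threshold.
        if held is not None and now - held[1] < self.duration_threshold:
            return False
        self._locks[key] = (owner, now)
        return True

    def release(self, key, owner):
        """Explicit release, allowed only for the current holder."""
        held = self._locks.get(key)
        if held is not None and held[0] == owner:
            del self._locks[key]
            return True
        return False
```

Expiry on the cache side complements the release performed by the second server itself: if that server fails before it can release the lock, the sub-file still becomes available again once the threshold is reached.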
A server, characterized in that the server comprises a memory and a processor, the memory stores a computer program that can run on the processor, and the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 6.
A chip system, characterized in that the chip system includes a processor, the processor is coupled with a memory, and the processor executes a computer program stored in the memory, so as to implement the commodity data management method according to any one of claims 1 to 6.