CN114579590A - Multi-source data processing method, device and system and storage medium - Google Patents

Multi-source data processing method, device and system and storage medium

Info

Publication number
CN114579590A
Authority
CN
China
Prior art keywords
data
target
processed
target data
source
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210201955.8A
Other languages
Chinese (zh)
Inventor
陈奕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Kule Information Technology Co ltd
Original Assignee
Suzhou Kule Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Kule Information Technology Co ltd filed Critical Suzhou Kule Information Technology Co ltd
Priority to CN202210201955.8A priority Critical patent/CN114579590A/en
Publication of CN114579590A publication Critical patent/CN114579590A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0207 Discounts or incentives, e.g. coupons or rebates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/12 Hotels or restaurants
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/14 Travel agencies

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Tourism & Hospitality (AREA)
  • Theoretical Computer Science (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Development Economics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Primary Health Care (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Game Theory and Decision Science (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The embodiment of the specification discloses a multi-source data processing method, device and system and a storage medium. The method comprises the following steps: acquiring a plurality of to-be-processed data and a source identifier of each to-be-processed data; performing preprocessing operation on the multiple data to be processed to obtain one or more target data; for each piece of target data, determining a unique identifier based on the data content of the target data and the source identifier; determining whether the target data is valid based on the unique identifier; in response to the target data being valid, performing at least one of the following operations based on the target data: data update and data transmission.

Description

Multi-source data processing method, device and system and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a multi-source data processing method, apparatus, system, and storage medium.
Background
With the rapid development of electronic commerce, more and more enterprises are beginning to provide products and/or services to users through internet platforms. Currently, in the online travel and/or hospitality industry, a user may request desired products and/or services online, for example, to book a hotel room or a travel service. These products and/or services may be provided by various suppliers, such as hotels. A supplier can upload information about its products and/or services to an internet platform, and a user can access the platform to retrieve the data, review and compare the products and/or services, and then make a purchase. In this process, a large amount of data interaction, such as updating product information, must occur between the platform and the supplier. It is therefore necessary to ensure that information is pushed accurately and in a timely manner.
Disclosure of Invention
To achieve the above object, one of the embodiments of the present specification provides a multi-source data processing method. The method comprises the following steps: acquiring a plurality of to-be-processed data and a source identifier of each to-be-processed data; performing preprocessing operation on the multiple data to be processed to obtain one or more target data; for each piece of target data, determining a unique identifier based on the data content of the target data and the source identifier; determining whether the target data is valid based on the unique identifier; in response to the target data being valid, performing at least one of the following operations based on the target data: data update and data transmission.
One of the embodiments of the present description provides a multi-source data processing apparatus. The apparatus comprises an acquisition module, a preprocessing module and an execution module. The acquisition module is used for acquiring a plurality of to-be-processed data and the source identifier of each to-be-processed data. The preprocessing module is used for executing a preprocessing operation on the multiple data to be processed to obtain one or more target data. For each piece of target data, the execution module is used for: determining a unique identifier based on the data content of the target data and the source identifier; determining whether the target data is valid based on the unique identifier; and in response to the target data being valid, performing at least one of the following operations based on the target data: data update and data transmission.
One of the embodiments of the present description provides a multi-source data processing system that includes a memory, a processor, and a computer program stored on the memory and executable on the processor. When executed by the processor, the computer program implements the steps of the method described above.
One of the embodiments of the present specification provides a computer-readable storage medium on which a computer program is stored, which, when being executed by a processor, implements the steps of the method as described above.
Additional features will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following and the accompanying drawings or may be learned by production or operation of the examples. The features of the present invention may be realized and obtained by means of the instruments and methods set forth in the detailed description below.
Drawings
The present description will be further explained by way of exemplary embodiments, which will be described in detail by way of the accompanying drawings. These embodiments are not intended to be limiting, and in these embodiments like numerals are used to indicate like structures, wherein:
FIG. 1 is a schematic diagram of an application scenario for a multi-source data processing system in accordance with some embodiments of the present description;
FIG. 2 is an exemplary block diagram of a processing device, shown in accordance with some embodiments of the present description;
FIG. 3 is an exemplary flow diagram of a multi-source data processing method according to some embodiments of the present description;
FIG. 4 is an exemplary flow diagram illustrating obtaining data to be processed according to some embodiments of the present description;
FIG. 5 is an exemplary flow diagram illustrating the determination of valid target data according to some embodiments of the present description;
FIG. 6 is an exemplary block diagram of a multi-source data processing system shown in accordance with some embodiments of the present description; and
FIG. 7 is an exemplary block diagram of the modules of an execution module according to some embodiments of the present description.
Detailed Description
In order to more clearly illustrate the technical solutions of the embodiments of the present specification, the drawings used in the description of the embodiments will be briefly described below. It is obvious that the drawings in the following description are only examples or embodiments of the present description, and that for a person skilled in the art, the present description can also be applied to other similar scenarios on the basis of these drawings without inventive effort. Unless otherwise apparent from the context, or otherwise indicated, like reference numbers in the figures refer to the same structure or operation.
It should be understood that "system", "device", "unit" and/or "module" as used herein are terms for distinguishing different components, elements, parts, portions or assemblies at different levels. However, these terms may be replaced by other expressions that accomplish the same purpose.
As used in this specification and the appended claims, the terms "a," "an," and/or "the" do not refer specifically to the singular and may also include the plural, unless the context clearly dictates otherwise. In general, the terms "comprises" and "comprising" merely indicate that the explicitly identified steps and elements are included; these steps and elements do not constitute an exclusive list, and the method or apparatus may also comprise other steps or elements.
Flow charts are used in this description to illustrate operations performed by a system according to embodiments of the present description. It should be understood that the preceding or following operations are not necessarily performed in the exact order shown. Rather, the various steps may be processed in reverse order or simultaneously. In addition, other operations may be added to these processes, or one or more operations may be removed from them.
FIG. 1 is a schematic diagram of an application scenario for an exemplary multi-source data processing system, shown in accordance with some embodiments of the present description. In some embodiments, the multi-source data processing system may be used to perform multi-source data-based processing operations, such as data updates, data pushes, and the like. As shown in fig. 1, the application scenario 100 may include a processing device 110, a network 120, a data providing source 130, a data receiving source 140, and a storage device 150.
The processing device 110 may be used to perform one or more of the functions disclosed in this specification. For example, the processing device 110 may obtain multiple copies of the data to be processed and a source identification for each copy of the data to be processed. For another example, the processing device 110 may perform a preprocessing operation on the multiple copies of the data to be processed to obtain one or more target data. For example, the processing device 110 may determine whether each piece of target data is valid and perform data update and/or data transfer operations based on the valid target data. In some embodiments, the processing device 110 may be a stand-alone server or a group of servers. The set of servers may be centralized or distributed (e.g., processing device 110 may be a distributed system). In some embodiments, the processing device 110 may interface directly with the data providing source 130, the data receiving source 140, and the storage device 150 to enable access and/or pushing of information and/or material. In some embodiments, the processing device 110 may execute on a cloud platform. For example, the cloud platform may include one or any combination of a private cloud, a public cloud, a hybrid cloud, a community cloud, a decentralized cloud, an internal cloud, and the like.
In some embodiments, the processing device 110 may include one or more processing engines (e.g., single-core or multi-core processors). By way of example only, the processing device 110 may include one or more combinations of Central Processing Units (CPUs), Application Specific Integrated Circuits (ASICs), application specific instruction set processors (ASIPs), graphics processing units (GPUs), physics processing units (PPUs), Digital Signal Processors (DSPs), Field Programmable Gate Arrays (FPGAs), Programmable Logic Devices (PLDs), controllers, microcontroller units, Reduced Instruction Set Computers (RISCs), microprocessors, and the like.
Network 120 may facilitate the exchange of information and/or data. In some embodiments, one or more components of the application scenario 100 (e.g., the processing device 110, the data providing source 130, the data receiving source 140, and the storage device 150) may communicate information to other components of the application scenario 100 over the network 120. For example, the processing device 110 may obtain data to be processed from the data providing source 130 via the network 120 and push the data/information to the data receiving source 140 via the network 120. In some embodiments, the network 120 may be any form of wired or wireless network, or any combination thereof. By way of example only, network 120 may be one or more combinations of a wireline network, a fiber optic network, a telecommunications network, an intranet, the Internet, a Local Area Network (LAN), a Wide Area Network (WAN), a Wireless Local Area Network (WLAN), a Metropolitan Area Network (MAN), a Public Switched Telephone Network (PSTN), a Bluetooth™ network, a ZigBee™ network, a Near Field Communication (NFC) network, a Global System for Mobile communications (GSM) network, a Code Division Multiple Access (CDMA) network, a Time Division Multiple Access (TDMA) network, a General Packet Radio Service (GPRS) network, an enhanced data rates for GSM evolution (EDGE) network, a Wideband Code Division Multiple Access (WCDMA) network, a High Speed Downlink Packet Access (HSDPA) network, a Long Term Evolution (LTE) network, a User Datagram Protocol (UDP) network, a Transmission Control Protocol/Internet Protocol (TCP/IP) network, a Short Message Service (SMS) network, a Wireless Application Protocol (WAP) network, an Ultra Wide Band (UWB) network, a mobile communications (1G, 2G, 3G, 4G, 5G) network, Wi-Fi, Li-Fi, narrowband Internet of things (NB-Fi), IoT communications, and the like.
In some embodiments, network 120 may include one or more network access points. For example, network 120 may include wired or wireless network access points, such as base stations and/or internet exchange points, through which one or more components of the application scenario 100 may connect to the network 120 to exchange information and/or data.
The data providing source 130 may provide data for the operation of other components in the application scenario 100. In some embodiments, the data providing source 130 may provide the data to be processed, for example, room-related change information for hotel products/services. In some embodiments, the data providing source 130 may be implemented in the form of a terminal and/or a server. For example, the data providing source 130 may be implemented on multiple servers, or on one or more terminal devices connected by communication links. The data providing source 130 may also be implemented by a cloud server. For example, the data may be uploaded to the cloud and then transmitted to the processing device 110.
The data receiving source 140 may receive push information transmitted by the processing device 110. In some embodiments, the data receiving source 140 may also present the push information externally after receiving it. For example, the data receiving source 140 may be a seller of hotel products/services, such as an internet platform. After the push information is received, when a user accesses the internet platform and queries related hotel products/information through a corresponding application program (e.g., an app or other software program) installed on a smart terminal (e.g., a smartphone or a computer), the data receiving source 140 may present the latest related information to the user.
Storage device 150 may store data and/or instructions. The processing device 110 may execute or use the data and/or instructions to implement the example methods of this specification. In some embodiments, the storage device 150 may be connected to network 120 to enable communication with one or more components in the application scenario 100 (e.g., processing device 110, data providing source 130, data receiving source 140, etc.). One or more components of the application scenario 100 may access data or instructions stored in the storage device 150 through the network 120. In some embodiments, the storage device 150 may be directly connected to or in communication with one or more components of the application scenario 100 (e.g., the processing device 110, the data providing source 130, the data receiving source 140, etc.). In some embodiments, the storage device 150 may be part of the processing device 110. For example, the storage device 150 may act as a local storage device, such as a magnetic disk, for the processing device 110.
In some embodiments, storage device 150 may include mass storage, removable storage, volatile read-write memory, read-only memory (ROM), and the like, or any combination thereof. Exemplary mass storage may include magnetic disks, optical disks, solid state disks, and the like. Exemplary removable storage may include flash drives, floppy disks, optical disks, memory cards, compact disks, magnetic tape, and so forth. Exemplary volatile read-write memory may include Random Access Memory (RAM). Exemplary RAM may include Dynamic RAM (DRAM), double-data-rate synchronous dynamic RAM (DDR SDRAM), Static RAM (SRAM), thyristor RAM (T-RAM), zero-capacitor RAM (Z-RAM), and the like. Exemplary ROMs may include Mask ROM (MROM), Programmable ROM (PROM), erasable programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), compact disc ROM (CD-ROM), digital versatile disc ROM, and the like. In some embodiments, storage device 150 may be implemented on a cloud platform. By way of example only, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an internal cloud, a multi-tiered cloud, and the like, or any combination thereof. For example, some of the algorithms or data disclosed in this specification may be stored on a cloud platform and updated periodically. The processing device 110 accesses these algorithms or data over the network 120 so that the algorithms or data are used uniformly and shared across the platform. In particular, some historical data may be stored centrally on one cloud platform for access or update by multiple processing devices 110, so as to ensure real-time and cross-platform use of the data.
Fig. 2 is a block diagram of an exemplary processing device shown in accordance with some embodiments of the present description. Processing device 110 may include any components used to implement the systems described in embodiments herein. For example, the processing device 110 may be implemented in hardware, software programs, firmware, or a combination thereof. For convenience, only one processing device is depicted in the figure, but the computing functionality associated with the application scenario 100 described in the embodiments of the present specification may be implemented in a distributed manner by a set of similar platforms to distribute the processing load of the system.
In some embodiments, processing device 110 may include a processor 210, a memory 220, an input/output component 230, and a communication port 240. In some embodiments, the processor 210 (e.g., a CPU) may execute program instructions and may take the form of one or more processors. In some embodiments, memory 220 includes different forms of program memory and data storage, such as a hard disk, Read Only Memory (ROM), Random Access Memory (RAM), etc., for storing a wide variety of data files that are processed and/or transmitted by a computer. In some embodiments, input/output component 230 may be used to support input/output between processing device 110 and other components. In some embodiments, the communication port 240 may be connected to a network to enable data communication. An exemplary processing device may include program instructions stored in Read Only Memory (ROM), Random Access Memory (RAM), and/or other types of non-transitory storage media that are executed by processor 210. The methods and/or processes of the embodiments of the present specification can be implemented as program instructions. The processing device 110 may also receive the programs and data disclosed in this specification through network communication.
For ease of understanding, only one processor is exemplarily depicted in fig. 2. However, it should be noted that the processing device 110 in the embodiment of the present specification may include a plurality of processors, and thus, the operations and/or methods described in the embodiment of the present specification and implemented by one processor may also be implemented by a plurality of processors, collectively or independently. For example, if in this specification the processors of processing device 110 perform steps 1 and 2, it should be understood that steps 1 and 2 may also be performed by two different processors of processing device 110, either collectively or independently (e.g., a first processor performing step 1, a second processor performing step 2, or a first and second processor performing steps 1 and 2 collectively).
FIG. 3 is an exemplary flow diagram of a multi-source data processing method according to some embodiments of the present description. In some embodiments, flow 300 may be performed by processing device 110. For example, the flow 300 may be stored in a storage device (e.g., an on-board storage unit of the processing device 110 such as the memory 220, or an external storage device such as the storage device 150) in the form of a program or instructions, which, when retrieved and executed, implement the flow 300. In some embodiments, flow 300 may be performed by multi-source data processing system 700. As shown in FIG. 3, the flow 300 may include the following operations.
Step 310, acquiring a plurality of pieces of data to be processed and a source identifier of each piece of data to be processed. This step may be performed by the acquisition module 610.
In some embodiments, the data to be processed may include change data for product/service related information. For example, assuming that the product/service is an online travel or hotel product such as a travel service or a hotel room, the data to be processed may be the service content or service price of the travel service, or change information about the price of the hotel room or about a related complimentary item such as breakfast. In some embodiments, the multiple pieces of data to be processed may come from one or more data providers, such as the data providing source 130. The data provider may be a supplier of the product/service, for example, a hotel supplier that provides guest rooms, such as a chain hotel group. Taking a hotel as an example, a hotel typically offers more than one room type, and the services attached to each room type may differ. Thus, the multiple pieces of data to be processed may all come from a single hotel and include change data for the information related to its different room types. They may also come from multiple hotels, each contributing change data for the information related to its own room types.
In some embodiments, the acquisition module 610 may obtain the multiple pieces of data to be processed from one or more data queues based on a data extraction rule. By way of example, a hotel room product may be described along multiple dimensions, such as price, whether a discount applies to the reservation, the number of beds, and whether there is a window. Change information about the product/service may be a change in the description of one or more of these dimensions. Accordingly, the change information can first be placed into different data queues according to the dimension that changed. The acquisition module 610 then obtains the multiple pieces of data to be processed from these queues based on a specific data extraction rule. For a specific description of acquiring the data to be processed, reference may be made to the description of FIG. 4 in this specification.
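The queue-per-dimension idea can be sketched as follows. This is only an illustrative sketch: the record fields, the function names, and the extraction rule used here (take up to a fixed number of the oldest records from each queue) are assumptions, since this passage does not specify the actual rule.

```python
from collections import deque

# Hypothetical change records; the field names are illustrative assumptions.
changes = [
    {"dimension": "price", "room": "king", "value": 520},
    {"dimension": "window", "room": "twin", "value": True},
    {"dimension": "price", "room": "twin", "value": 430},
]


def enqueue_by_dimension(records):
    """Place each change record into the queue for the dimension that changed."""
    queues = {}
    for record in records:
        queues.setdefault(record["dimension"], deque()).append(record)
    return queues


def extract_batch(queues, per_queue_limit):
    """Assumed extraction rule: take up to `per_queue_limit` of the oldest
    records from each queue, visiting queues in sorted name order."""
    batch = []
    for name in sorted(queues):
        queue = queues[name]
        for _ in range(min(per_queue_limit, len(queue))):
            batch.append(queue.popleft())
    return batch


queues = enqueue_by_dimension(changes)
batch = extract_batch(queues, per_queue_limit=1)
```

Records left in a queue after one extraction simply wait for the next one; other rules (priority by dimension, time windows) would fit the same structure.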
In some embodiments, the source identifier may be used to indicate the source or sender of the data to be processed. The source identifier can be any combination of one or more of numbers, letters, characters, and the like. In some embodiments, the acquisition module 610 may determine the data provider of the data to be processed and determine the source identifier based on that data provider. By way of example, assume that four enterprises, A, B, C, and D, offer products/services. The change information about the products/services sent by the four enterprises may be the data to be processed. The source identifier of data from enterprise A may be 1, that of data from enterprise B may be 2, that of data from enterprise C may be 3, and that of data from enterprise D may be 4. It should be noted that the above example is for illustrative purposes and does not limit source identifiers to numbers only.
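As a minimal sketch of the A/B/C/D example, a registry can hand out one stable identifier per data provider. The class name `SourceRegistry` and the choice of sequential numeric strings are assumptions for illustration; any combination of numbers, letters, or characters would serve equally well.

```python
from itertools import count


class SourceRegistry:
    """Hypothetical helper that assigns one stable source identifier per provider."""

    def __init__(self):
        self._ids = {}
        self._next = count(1)

    def source_id(self, provider_name):
        # Assign a new identifier the first time a provider is seen,
        # and return the same identifier on every later lookup.
        if provider_name not in self._ids:
            self._ids[provider_name] = str(next(self._next))
        return self._ids[provider_name]


registry = SourceRegistry()
ids = [registry.source_id(name) for name in ["A", "B", "C", "D", "A"]]
```

Here enterprise A receives the same identifier on its second lookup, so data arriving later from the same provider stays attributable to it.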
In some embodiments, the pending data may be transmitted by the data providing source 130. The obtaining module 610 may communicate with the data providing source 130 through the network 120 to obtain the data to be processed.
Step 320, performing a preprocessing operation on the multiple pieces of data to be processed to obtain one or more pieces of target data. This step may be performed by the pre-processing module 620.
It will be appreciated that not all of the product/service change information sent by the data providing source 130 is guaranteed to be valid. For example, suppose a hotel chain enterprise operates five-star hotels, four-star hotels, and budget hotels, but the products/services it offers on internet platforms include only its five-star hotels. If the change information sent by the hotel chain enterprise relates to one of its budget hotels, the change information can be considered invalid. In addition, data to be processed from different data sources may have different formats, and converting these formats into a uniform format improves the efficiency of data processing. Therefore, the preprocessing module 620 may perform a preprocessing operation on the multiple pieces of data to be processed to remove invalid data and output the remaining data in a uniform format.
In some embodiments, the pre-processing operations may include data filtering and data format conversion. The pre-processing module 620 may determine one or more valid pieces of data to be processed from the plurality of pieces of data to be processed based on at least data content of the data to be processed to implement the data filtering operation.
In some embodiments, the data content of the data to be processed may include at least a data reception time, a product/service name, a source name, a current product/service status, a current order period, a price, a discount, a detailed product/service description, a current status, an associated internet platform, and the like. Taking a hotel product as an example: the data reception time may be the time at which the information related to the hotel product was obtained; the product/service name may be a specific room type of a hotel, such as a king-bed room of that hotel; the source name may be the name of the specific hotel; the current product/service status may be whether the product/service can be booked at the current time; the price and discount may be the price of booking the product for a certain period (for example, one or two nights during the travel season) and the discount available (for example, a 10% discount on a two-night booking); the detailed product/service description may include the size of the king-bed room, its facilities such as a hair dryer, wardrobe, and air conditioner, whether it has a window, and the services that come with the booked room, such as room cleaning and free breakfast; the current status may include whether the room is reserved, whether the reservation order has been paid, and so on; and the associated internet platform may be an internet platform that provides a transaction service for the hotel room type, on which the user can book that room type.
In some embodiments, one or more items of specific information in the data content may have a content identifier. Continuing with the hotel product example, the source name in the data content may have a hotel identifier, the product/service name may have a room type identifier, and the product/service description may have a product identifier. In conjunction with the source identifier of the data to be processed, the pre-processing module 620 may determine a label corresponding to the data to be processed based on these identifiers. Different products/services have different labels. The pre-processing module 620 may determine whether the data to be processed is valid by determining whether the label is present in a predetermined list of products/services and whether the product/service can be ordered on an internet platform. If the label exists in the predetermined product/service list and the product/service can be ordered on one or more internet platforms, the data to be processed is valid. Otherwise, the data to be processed is not valid and can be discarded.
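The filtering described above can be sketched as follows. This is only an illustrative sketch: the names `Pending`, `make_label`, and `filter_valid` are hypothetical, and it assumes that the label is formed by joining the source identifier and the content identifiers with "-" and that the predetermined list records, for each label, whether the product/service can currently be ordered. The specification does not fix either choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pending:
    """One piece of data to be processed (hypothetical structure)."""
    source_id: str
    hotel_id: str
    room_type_id: str
    product_id: str


def make_label(item: Pending) -> str:
    # Assumed label layout: identifiers joined with "-".
    return "-".join([item.source_id, item.hotel_id, item.room_type_id, item.product_id])


def filter_valid(items, product_list):
    """Keep items whose label is in the predetermined list and is orderable.

    product_list maps label -> True if the product/service can be ordered
    on at least one internet platform.
    """
    return [it for it in items if product_list.get(make_label(it), False)]


# Example: only the first item is both listed and orderable.
product_list = {"S1-H1-2-P7": True, "S1-H1-3-P8": False}
items = [
    Pending("S1", "H1", "2", "P7"),
    Pending("S1", "H1", "3", "P8"),  # listed but not orderable
    Pending("S2", "H9", "1", "P1"),  # not listed at all
]
valid = filter_valid(items, product_list)
```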
In some embodiments, the data format conversion may include converting a data format of the one or more valid pending data. When the valid data to be processed is determined, the preprocessing module 620 may convert the format of the valid data to be processed into a uniform format. This unified format is beneficial to the computation of the physical device 110 (or the multi-source data processing system 700), improving computational efficiency.
In this description, for each target data, multi-source data processing system 700 (e.g., execution module 630) may perform steps 330 through 350.
Step 330, determining a unique identifier based on the data content of the target data and the source identifier. This step may be performed by the determination unit 710 of the execution module 630.
In some embodiments, the data content of the target data may be the data content of the data to be processed that was determined in step 320 to be the target data. Reference may be made to the description in step 320.
In some embodiments, the unique identification may be used to characterize the identity of the product/service. Each product/service corresponds to one unique identification. To determine the unique identification of the target data, the determination unit 710 may determine one or more content identifications indicating the data content of the target data. In some embodiments, a content identification may be an identification for distinguishing homogeneous information (information of the same kind) in the data content of different target data. For example, for a hotel product, the homogeneous information may be the name of the hotel, the type of room, a detailed product description, and the like. Each of these three items can have a content identification, which can be one or a combination of numbers, letters, symbols, and the like. It should be understood that the number of items of homogeneous information in the data content of the target data is not limited to the above. The content identifications corresponding to the different categories of information may be predetermined. For example, hotel room types including standard rooms, large bed rooms, luxury suites, executive suites, and presidential suites may have different content identifications corresponding to the different room types. For example, the content identification of the standard room is 1, that of the large bed room is 2, that of the luxury suite is 3, that of the executive suite is 4, and that of the presidential suite is 5. Once the hotel room type corresponding to the data content of the target data is determined, the content identification of the target data can be determined.
In some embodiments, the determination unit 710 encodes or hashes the one or more content identifications and the source identification to determine the unique identification. Exemplary encoding algorithms may include ASCII, Unicode, UTF8, URL encoding, HTML encoding, Base64, and the like. Exemplary hashing algorithms may include MD5, SHA1, SHA256, SHA512, NTLM.
In some embodiments, the determination unit 710 may combine the source identification and the one or more content identifications to determine the unique identification. As an example, for a hotel product, the one or more content identifications may include a hotel identifier, a room type identifier, and a product identifier. The combination may be such that the identifiers are separated from one another by a specific character. For example, the combined unique identification may be represented as "source identifier-hotel identifier-room type identifier-product identifier".
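A minimal sketch of the two options described above (plain combination, or combination followed by hashing) is given below. The function names are hypothetical, and SHA256 is chosen only because it is one of the hashing algorithms listed; the specification does not prescribe a particular algorithm or separator.

```python
import hashlib


def combine_identifiers(source_id, content_ids, sep="-"):
    """Join the source identifier and content identifiers with a separator."""
    return sep.join([source_id, *content_ids])


def unique_identifier(source_id, content_ids, hashed=True):
    """Return the combined identifier, optionally as a SHA256 hex digest."""
    combined = combine_identifiers(source_id, content_ids)
    if not hashed:
        return combined
    return hashlib.sha256(combined.encode("utf-8")).hexdigest()


# Hotel example: source, hotel, room type, and product identifiers.
plain = unique_identifier("S1", ["H1", "2", "P7"], hashed=False)
digest = unique_identifier("S1", ["H1", "2", "P7"])
```

If an identifier could itself contain the separator character, two different identifier tuples could produce the same combined string, so a practical implementation would likely restrict the identifier alphabet or escape the separator.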
In some embodiments, the unique identification may also be encrypted. Exemplary encryption methods may include symmetric encryption such as DES or AES, and asymmetric encryption such as RSA. The encrypted unique identifier can ensure the security of the unique identifier, for example, the security in the transmission process.
Step 340, determining whether the target data is valid based on the unique identifier. This step may be performed by the decision unit 720 of the execution module 630.
In some embodiments, the decision unit 720 may search a pre-built product/service database for a product/service that has the same identifier as the unique identifier. If such a product/service is found, the decision unit 720 may compare the data content of the target data with the data content stored for that product/service. If a difference exists, the target data is valid; if no difference exists, the target data is invalid. For details on determining whether the target data is valid, reference may be made to FIG. 5 of this specification.
In some embodiments, when the target data is determined to be invalid, process 300 may end and the target data may be discarded.
Step 350, based on the target data, performing at least one of the following operations: data update and data transmission. This step may be performed by the execution unit 730 of the execution module 630.
In some embodiments, the data update may include updating data related to the target data in the pre-built product/service database based on the data content of the target data. In this specification, the above-described pre-built product/service database may also be referred to as an existing data set. The existing data set stores existing data related to various products/services. When the target data is valid, the existing data set stores existing data related to the product/service corresponding to the target data, and the product/service can be ordered on one or more internet platforms. The execution unit 730 may compare the data content of the target data with the data content of the related existing data in the existing data set, and update the portion of the data content of the existing data that differs from the data content of the target data. The execution unit 730 may also directly replace the data content of the existing data with the data content of the target data.
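The partial update can be sketched as a field-by-field comparison. This is an illustrative sketch only; it assumes that data content is represented as a flat dictionary of fields, which the specification does not state, and the function name `update_existing` is hypothetical.

```python
def update_existing(existing: dict, target: dict) -> dict:
    """Update only the fields of `existing` that differ from `target`.

    Modifies `existing` in place and returns the fields that were changed.
    Fields present only in `existing` are left untouched.
    """
    changed = {
        field: value
        for field, value in target.items()
        if field not in existing or existing[field] != value
    }
    existing.update(changed)
    return changed


# Only the price differs, so only the price is rewritten.
existing = {"price": 500, "discount": 0.9, "status": "available"}
target = {"price": 450, "discount": 0.9}
changed = update_existing(existing, target)
```

The direct-replacement alternative mentioned above would correspond to overwriting the stored record with the target data as a whole instead of merging field by field.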
In some embodiments, the data transmission may include transmitting data content of the target data to one or more data recipients. For example, the data recipient may be an internet platform. Products/services corresponding to the target data can be ordered on one or more internet platforms. The internet platform needs to maintain the timeliness of the data to ensure that the product/service is properly ordered. The execution unit 730 may transmit the data content of the target data to the one or more internet platforms.
In some embodiments, execution unit 730 may obtain relational mapping data. The relational mapping data may include one or more data recipients (e.g., the data receiving source 140) associated with the source identification of the target data. The one or more data recipients may be one or more internet platforms for providing product/service orders. As an example, assuming that a particular product/service P provided by enterprise A is available for ordering on internet platform M and internet platform N, the relational mapping data may be the relationship between product/service P and internet platform M and internet platform N. The relational mapping data may be represented using a mapping table, such as "product/service P-internet platform M, internet platform N". In some embodiments, upon determining the one or more data recipients to which the target data relates, execution unit 730 may transmit the target data or adaptation data of the target data to the one or more data recipients. The adaptation data of the target data may be in a data format accepted by the data recipient. For example, if the data formats that the data processing systems of data recipient 1 and data recipient 2 are suited to process are the JSON format and the CSV format, respectively, the execution unit 730 may convert the data format of the target data into the JSON format and transmit it to data recipient 1, and convert the data format of the target data into the CSV format and transmit it to data recipient 2.
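The mapping table and the per-recipient format adaptation can be sketched as follows. The table contents, the recipient names, and the function names are hypothetical, and the sketch assumes the mapping is keyed by source identification and that each recipient declares a single accepted format.

```python
import csv
import io
import json

# Hypothetical relational mapping data: source identification -> data recipients.
RECIPIENT_MAP = {"enterprise_A": ["platform_M", "platform_N"]}
# Hypothetical format accepted by each data recipient.
RECIPIENT_FORMAT = {"platform_M": "json", "platform_N": "csv"}


def adapt(target: dict, fmt: str) -> str:
    """Convert target data into the format accepted by a recipient."""
    if fmt == "json":
        return json.dumps(target, sort_keys=True)
    if fmt == "csv":
        buf = io.StringIO()
        writer = csv.DictWriter(buf, fieldnames=sorted(target))
        writer.writeheader()
        writer.writerow(target)
        return buf.getvalue()
    raise ValueError(f"unsupported format: {fmt}")


def build_messages(source_id: str, target: dict) -> dict:
    """Return {recipient: adaptation data} for every recipient of the source."""
    recipients = RECIPIENT_MAP.get(source_id, [])
    return {r: adapt(target, RECIPIENT_FORMAT[r]) for r in recipients}


messages = build_messages("enterprise_A", {"price": 450, "room_type": "2"})
```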
In some embodiments, the multi-source data processing system 700 (e.g., execution unit 730) may also determine whether the data transmission operation was successful. For example, when the data transmission operation is successful, the multi-source data processing system 700 (e.g., execution unit 730) may receive feedback from the data recipient. If the feedback is not received, multi-source data processing system 700 (e.g., execution unit 730) may determine that the data transmission operation was unsuccessful. In that case, the target data can be acquired again as data to be processed, together with the source identification of the target data, and flow 300 may return to step 310.
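One way to picture this feedback check is shown below. It is a sketch under assumptions: `send` is a hypothetical callable that returns True only when the recipient's feedback arrives, and `pending` is a hypothetical queue from which data to be processed is later acquired again.

```python
from collections import deque


def transmit_with_requeue(target, source_id, send, pending):
    """Transmit target data; on missing feedback, queue it for reprocessing.

    Returns True if feedback was received, False otherwise.
    """
    if send(target):
        return True
    # No feedback: treat the transmission as unsuccessful and hand the
    # target data, with its source identification, back for reacquisition.
    pending.append((target, source_id))
    return False


pending = deque()
ok = transmit_with_requeue({"price": 450}, "S1", lambda t: True, pending)
failed = transmit_with_requeue({"price": 470}, "S1", lambda t: False, pending)
```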
It should be noted that the above description of the various steps in fig. 3 is for illustration and description only and does not limit the scope of applicability of the present description. Various modifications and changes to the various steps in fig. 3 will be apparent to those skilled in the art in light of this description. However, such modifications and variations are intended to be within the scope of the present description. For example, each step may be followed by a data storage operation.
The multi-source data processing method disclosed in this specification is compatible with multi-source data update information of various types and in multiple data formats, supports management of the relationship between data providers and data recipients, and enables more efficient message pushing.
FIG. 4 is an exemplary flow diagram illustrating obtaining data to be processed according to some embodiments of the present description. In some embodiments, flow 400 may be performed by processing device 110. For example, the process 400 may be stored in a storage device (e.g., an on-board storage unit of the processing device 110 such as the memory 220 or an external storage device such as the storage device 120) in the form of a program or instructions, which when extracted and executed, may implement the process 400. In some embodiments, the flow 400 may be performed by the acquisition module 610. As shown in fig. 4, the flow 400 may include the following operations.
At step 410, multiple copies of raw data are obtained.
In some embodiments, the raw data is the same as or similar to the to-be-processed data, also sent by one or more data providers. The one or more data providers (e.g., data providing source 130) may send the multiple copies of raw data to processing device 110 (or multi-source data processing system 700) over network 120.
Step 420, for each piece of raw data, partitioning the raw data into one of one or more data queues based on the data type of the raw data.
In some embodiments, the data type of the raw data may be the data type of the portion of the data content of the raw data that has changed. Taking the description of the relevant part of the process 300 and the hotel products as an example, the data content of the raw data related to a hotel product (e.g., a specific hotel room) can be divided into the following five types: room status information, static information, product rule information, product price information, and order status information. The room status information may indicate a current status of the hotel room, such as occupied, vacant, or reserved, and the current product/service status may indicate the room status information. The static information may indicate specific facilities contained in the hotel room, such as a hair dryer, air conditioning, etc., and the detailed product/service description may indicate the static information. The product rule information may indicate discount and price increase information for ordering the room, such as the discount that can be enjoyed or the degree of price increase for ordering the room in a certain period of time (e.g., the travel off-season or the peak season), and the discount may indicate the product rule information. The product price information may represent the price required to order the room, and the price may indicate the product price information. The order status information may indicate whether the room is ordered, including ordered but not paid, ordered and paid, not ordered, etc., and the current status may indicate the order status information. The acquisition module 610 may set up five data queues based on the above five types. The obtaining module 610 may insert the raw data into one of the above five data queues according to the data type of the changed portion of the data content of the raw data.
In some embodiments, when the changed portion of the data content of the raw data relates to two or more data types, the obtaining module 610 may insert the raw data into one of the corresponding queues chosen at random or chosen according to the alphabetical order of the initial letters of the data types.
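The queue partitioning of step 420 can be sketched as follows. The queue names are hypothetical English labels for the five data types above, and the tie-break shown (the alphabetically first of the changed data types) is only one of the two options mentioned; random selection would be equally consistent with the description.

```python
from collections import deque

# Hypothetical names for the five data types of the hotel example.
DATA_TYPES = ("room_status", "static", "product_rule", "product_price", "order_status")


def make_queues():
    """Create one FIFO queue per data type."""
    return {data_type: deque() for data_type in DATA_TYPES}


def enqueue(queues, raw, changed_types):
    """Insert raw data into the queue for its changed data type.

    When several data types changed, the alphabetically first one is used.
    Returns the name of the chosen queue.
    """
    chosen = sorted(changed_types)[0]
    queues[chosen].append(raw)
    return chosen


queues = make_queues()
first = enqueue(queues, {"id": 1}, ["product_price"])
second = enqueue(queues, {"id": 2}, ["static", "order_status"])
```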
Step 430, acquiring the multiple copies of the data to be processed from the at least one data queue based on a preset data extraction rule.
In some embodiments, the preset data extraction rules may include time-based data extraction rules, order-based data extraction rules, type-based data extraction rules, and the like. The time-based data extraction rule may be based on the acquisition time of the raw data, and extract data as the data to be processed according to a first-in first-out principle. For example, from the five queues in the above example, the raw data acquired first is extracted first according to the data reception time. The extracted data may come from different data queues. The order-based data extraction rule may be to extract the multiple copies of data to be processed from each data queue in order. For example, from the five data queues in the above example, a certain amount of raw data is extracted from each data queue as the data to be processed. The number of copies of raw data extracted from each queue may be the same or may be different, for example decreasing from one queue to the next. The type-based data extraction rule may be to extract raw data from the corresponding data queue according to a data type as the data to be processed. In some embodiments, the obtaining module 610 may further obtain the data to be processed from the at least one queue based on a random extraction rule.
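Two of these rules can be sketched as follows. The sketch assumes that each raw data item carries a hypothetical `received_at` field and that each queue is already in arrival order, so that the oldest item of a queue is always at its head; the function names are likewise hypothetical.

```python
from collections import deque


def extract_time_based(queues, n):
    """Take up to n items overall, earliest reception time first."""
    out = []
    while len(out) < n:
        nonempty = [q for q in queues.values() if q]
        if not nonempty:
            break
        # The globally oldest item is the oldest among the queue heads.
        oldest = min(nonempty, key=lambda q: q[0]["received_at"])
        out.append(oldest.popleft())
    return out


def extract_order_based(queues, per_queue):
    """Take up to per_queue[name] items from each queue, queue by queue."""
    out = []
    for name, q in queues.items():
        for _ in range(min(per_queue.get(name, 0), len(q))):
            out.append(q.popleft())
    return out
```

The type-based rule would simply read from `queues[data_type]`, and a random rule could pick a non-empty queue at random before taking its head.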
It should be noted that the above description regarding the steps in fig. 4 is for illustration and explanation only, and does not limit the applicable scope of the present specification. Various modifications and changes to the various steps in fig. 4 will be apparent to those skilled in the art in light of this description. However, such modifications and variations are intended to be within the scope of the present description. For example, each step may be followed by a data storage operation.
FIG. 5 is an exemplary flow diagram illustrating the determination of valid target data according to some embodiments of the present description. In some embodiments, flow 500 may be performed by processing device 110. For example, the process 500 may be stored in a storage device (e.g., an on-board storage unit of the processing device 110 such as the memory 220 or an external storage device such as the storage device 120) in the form of a program or instructions, which when extracted and executed, may implement the process 500. In some embodiments, flow 500 may be performed by multi-source data processing system 700. In some embodiments, flow 500 may be performed by decision unit 720 of execution module 630. As shown in fig. 5, the flow 500 may include the following operations.
Step 510, an existing data set is obtained.
In some embodiments, the existing data set may be the same as and/or similar to the pre-built product/service database mentioned in the preceding sections of this specification. The existing data set includes data related to various products/services that have been provided to the user by various product/service providers. This related data may be referred to as reference data in this specification. One piece of reference data may correspond to one product/service, and the product/service may be available for ordering on one or more internet platforms. In some embodiments, the existing data set may include one or more pieces of reference data, each corresponding to a reference identifier. The reference identifier may be considered a retrieval index of the corresponding reference data, and may be unique so as to distinguish it from other reference data. In some embodiments, the reference identifier may be determined in the same and/or a similar way as the unique identifier. It will be appreciated that the reference data included in the existing data set is in fact data relating to various products/services that product/service providers have provided to the user, and that its content is of the same kind as that of the target data. The two differ only in the time at which they are processed by processing device 110 (or multi-source data processing system 700). It is also contemplated that, at an earlier time, the reference data may itself have been data to be processed, which was then stored in the existing data set after the data update operation of step 350 in the process 300.
Step 520, determining the target reference data related to the target data based on the unique identifier and the reference identifier of the one or more reference data.
In some embodiments, the decision unit 720 may compare the unique identifier with the reference identifier of each of the one or more pieces of reference data. The reference data whose reference identifier is identical to the unique identifier may be designated as the target reference data related to the target data.
Step 530, determining whether the target data is valid based on the data content of the target data and the reference data content of the target reference data.
Based on the above description, the data content of the target data and the reference data content of the target reference data involve the same kinds of information fields. For example, for hotel room products, the data content of the target data and the reference data content of the target reference data both include fields such as room type, room price, discount, etc. The decision unit 720 may compare the information in the same field of the two pieces of data; if there is a difference, the target data may be considered valid. Otherwise, the target data may be considered invalid.
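Steps 520 and 530 can be sketched together as a lookup followed by a field comparison. The sketch assumes that the existing data set is a dictionary keyed by reference identifier and that data content is a flat dictionary of fields; it also assumes that target data with no matching reference data is treated as invalid, which is an interpretation rather than something stated explicitly here. The function names are hypothetical.

```python
def find_target_reference(existing_set: dict, unique_id: str):
    """Step 520: return the reference data whose identifier equals unique_id."""
    return existing_set.get(unique_id)


def is_valid(target: dict, existing_set: dict, unique_id: str) -> bool:
    """Step 530: target data is valid if any shared field differs."""
    reference = find_target_reference(existing_set, unique_id)
    if reference is None:
        return False  # assumption: no matching reference data means invalid
    shared_fields = target.keys() & reference.keys()
    return any(target[f] != reference[f] for f in shared_fields)


existing_set = {"uid-1": {"room_type": "2", "price": 500, "discount": 0.9}}
changed_price = is_valid({"room_type": "2", "price": 450, "discount": 0.9}, existing_set, "uid-1")
unchanged = is_valid({"room_type": "2", "price": 500, "discount": 0.9}, existing_set, "uid-1")
```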
It should be noted that the above description regarding the steps in fig. 5 is for illustration and explanation only, and does not limit the applicable scope of the present specification. Various modifications and changes to the steps of fig. 5 will be apparent to those skilled in the art in light of this description. However, such modifications and variations are intended to be within the scope of the present description. For example, each step may be followed by a data storage operation.
FIG. 6 is an exemplary block diagram of a multi-source data processing system, shown in accordance with some embodiments of the present description. As shown in fig. 6, the multi-source data processing system 600 may include an acquisition module 610, a pre-processing module 620, and an execution module 630.
The acquisition module 610 may acquire data. In some embodiments, the obtaining module 610 may obtain a plurality of pieces of data to be processed and a source identifier of each piece of data to be processed. The pending data may include change data of information related to the product/service. The obtaining module 610 may obtain the multiple copies of the pending data from one or more data queues based on a certain data extraction rule. In some embodiments, the obtaining module 610 may obtain multiple copies of the raw data. For each piece of raw data, the acquisition module 610 may partition the raw data into one of one or more data queues based on the data type of the raw data. And acquiring the multiple copies of the data to be processed from the at least one data queue based on a preset data extraction rule. The source identifier may be used to indicate a source or sender of the data to be processed. The obtaining module 610 may determine a data provider of the to-be-processed data, and determine a source identifier of the to-be-processed data based on the data provider.
The preprocessing module 620 may perform preprocessing operations on the multiple copies of data to be processed to obtain one or more target data. The preprocessing operations may include data filtering and data format conversion. The pre-processing module 620 may determine one or more valid pieces of data to be processed from the plurality of pieces of data to be processed based on at least data content of the data to be processed to implement the data filtering operation. One or more items of specific information in the data content may have a content identifier. The pre-processing module 620 may determine a label corresponding to the data to be processed based on the identifiers. The preprocessing module 620 can determine whether the data to be processed is valid by determining whether the label exists in a predetermined list of products/services and determining whether the product/service can be ordered on the internet platform. The data format conversion may include converting a data format of the one or more valid pending data. The pre-processing module 620 may convert the format of the valid data to be processed into a unified format.
The execution module 630 may determine the unique identifier based on the data content of the target data and the source identifier. The unique identification may be used to characterize the identity of the product/service. Each product/service corresponds to one unique identifier. The execution module 630 may determine the unique identification based on the one or more content identifications and the source identification. In some embodiments, execution module 630 may also determine whether the target data is valid based on the unique identification. The execution module 630 may search a pre-built product/service database for a product/service having the same identifier as the unique identifier. If such a product/service is found, the execution module 630 may compare the data content of the target data with the data content stored for that product/service. If a difference exists, the target data is valid; if no difference exists, the target data is invalid. In some embodiments, the execution module 630 may perform at least one of the following operations based on the target data: data update and data transmission. The data update may include updating data related to the target data in the pre-built product/service database based on the data content of the target data. The data transmission may include transmitting data content of the target data to one or more data recipients.
FIG. 7 is an exemplary block diagram of an execution module shown in accordance with some embodiments of the present description. As shown in fig. 7, the execution module 630 may include a determination unit 710, a decision unit 720, and an execution unit 730.
The determining unit 710 may determine the unique identifier based on the data content of the target data and the source identifier. The determining unit 710 may determine one or more content identifications indicating data content of the target data. The content identification may be an identification for distinguishing homogeneous information in data contents of different target data. The determining unit 710 encodes or hashes the one or more content identifications and the source identification to determine the unique identification. The determining unit 710 may combine the source identification and the one or more content identifications to determine the unique identification.
Decision unit 720 may determine whether the target data is valid based on the unique identification. In some embodiments, decision unit 720 may obtain an existing data set. The existing data set includes data related to various products/services that have been provided to the user by various product/service providers. The decision unit 720 may determine the target reference data related to the target data based on the unique identifier and the reference identifier of the one or more pieces of reference data. The decision unit 720 may compare the unique identifier with the reference identifier of each of the one or more pieces of reference data. The reference data whose reference identifier is identical to the unique identifier may be designated as the target reference data related to the target data. The decision unit 720 may determine whether the target data is valid based on the data content of the target data and the reference data content of the target reference data. The decision unit 720 may compare the information in the same field of the two pieces of data; if there is a difference, the target data may be considered valid. Otherwise, the target data may be considered invalid. If the target data is invalid, decision unit 720 may discard the target data.
The execution unit 730 may perform at least one of the following operations based on the target data: data update and data transmission. The executing unit 730 may compare the data content of the target data with the data content of the existing data related to the existing data set, and update a part of the data content of the existing data, which is different from the data content of the target data. The execution unit 730 may also directly replace the data content of the existing data with the data content of the target data. In some embodiments, execution unit 730 may obtain relational mapping data, which may include one or more data recipients related to the source identification of the target data. The execution unit 730 may transmit the target data or adaptation data of the target data to the one or more data recipients.
Additional description of the modules in fig. 6 and 7 may refer to portions of the flow diagrams of this specification, e.g., fig. 1-5.
It should be understood that the systems shown in fig. 6 and 7 and their modules may be implemented in a variety of ways. For example, in some embodiments, the system and its modules may be implemented in hardware, software, or a combination of software and hardware. Wherein the hardware portion may be implemented using dedicated logic; the software portions may be stored in a memory for execution by a suitable instruction execution system, such as a microprocessor or specially designed hardware. Those skilled in the art will appreciate that the methods and systems described above may be implemented using computer executable instructions and/or embodied in processor control code, such code being provided, for example, on a carrier medium such as a diskette, CD-or DVD-ROM, a programmable memory such as read-only memory (firmware), or a data carrier such as an optical or electronic signal carrier. The system and its modules in this specification may be implemented not only by hardware circuits such as very large scale integrated circuits or gate arrays, semiconductors such as logic chips, transistors, or programmable hardware devices such as field programmable gate arrays, programmable logic devices, etc., but also by software executed by various types of processors, for example, or by a combination of the above hardware circuits and software (e.g., firmware).
It should be noted that the above descriptions of the multi-source data processing system and the modules thereof are only for convenience of description, and do not limit this specification to the scope of the illustrated embodiments. It will be appreciated by those skilled in the art that, given the teachings of the present system, the modules may be combined arbitrarily or configured as sub-systems connected to other modules without departing from such teachings. For example, the modules may share one memory module, or each module may have its own memory module. Such variations are within the scope of the present disclosure.
Having thus described the basic concept, it will be apparent to those skilled in the art that the foregoing detailed disclosure is to be regarded as illustrative only and not as limiting the present specification. Various modifications, improvements and adaptations to the present description may occur to those skilled in the art, although not explicitly described herein. Such alterations, modifications, and improvements are intended to be suggested in this specification, and are intended to be within the spirit and scope of the exemplary embodiments of this specification.
Also, the description uses specific words to describe embodiments of the description. Reference throughout this specification to "one embodiment," "an embodiment," and/or "some embodiments" means that a particular feature, structure, or characteristic described in connection with at least one embodiment of the specification is included. Therefore, it is emphasized and should be appreciated that two or more references to "one embodiment" or "an embodiment" or "some embodiments" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, some features, structures, or characteristics of one or more embodiments of the specification may be combined as appropriate.
Moreover, those skilled in the art will appreciate that aspects of the present description may be illustrated and described in terms of several patentable species or situations, including any new and useful combination of processes, machines, manufacture, or materials, or any new and useful improvement thereof. Accordingly, aspects of this description may be performed entirely by hardware, entirely by software (including firmware, resident software, micro-code, etc.), or by a combination of hardware and software. The above hardware or software may be referred to as a "data block," "module," "engine," "unit," "component," or "system." Furthermore, aspects of the present description may be represented as a computer product, including computer readable program code, embodied in one or more computer readable media.
The computer storage medium may comprise a propagated data signal with the computer program code embodied therewith, for example, on baseband or as part of a carrier wave. The propagated signal may take any of a variety of forms, including electromagnetic, optical, etc., or any suitable combination. A computer storage medium may be any computer-readable medium that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code located on a computer storage medium may be propagated over any suitable medium, including radio, cable, fiber optic cable, RF, or the like, or any combination of the preceding.
Computer program code required for the operation of various portions of this specification may be written in any one or more programming languages, including an object-oriented programming language such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET, Python, and the like, a conventional programming language such as C, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, ABAP, a dynamic programming language such as Python, Ruby, and Groovy, or other programming languages. The program code may execute entirely on the user's computer, partly on the user's computer as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any form of network, such as a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet), or in a cloud computing environment, or as a service such as software as a service (SaaS).
Additionally, the order in which the elements and sequences of the process are recited in the specification, the use of alphanumeric characters, or other designations, is not intended to limit the order in which the processes and methods of the specification occur, unless otherwise specified in the claims. While various presently contemplated embodiments of the invention have been discussed in the foregoing disclosure by way of example, it is to be understood that such detail is solely for that purpose and that the appended claims are not limited to the disclosed embodiments, but, on the contrary, are intended to cover all modifications and equivalent arrangements that are within the spirit and scope of the embodiments herein. For example, although the system components described above may be implemented by hardware devices, they may also be implemented by software-only solutions, such as installing the described system on an existing server or mobile device.
Similarly, it should be noted that in the foregoing description of embodiments of the present specification, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the embodiments. This method of disclosure, however, does not imply that the claimed subject matter requires more features than are expressly recited in a claim. Indeed, the embodiments may have fewer than all of the features of a single embodiment disclosed above.
Some embodiments use numerals to describe quantities of components, attributes, and the like. It should be understood that such numerals are, in some instances, modified by "about," "approximately," or "substantially." Unless otherwise indicated, "about," "approximately," or "substantially" indicates that the stated number allows a variation of ±20%. Accordingly, in some embodiments, the numerical parameters used in the specification and claims are approximations that may vary depending upon the desired properties of the individual embodiments. In some embodiments, a numerical parameter should be read in light of the specified number of significant digits and with ordinary rounding applied. Although the numerical ranges and parameters used to define the broad scope of some embodiments are approximations, the numerical values set forth in specific embodiments are reported as precisely as practicable.
Each patent, patent application, patent application publication, and other material, such as articles, books, specifications, publications, documents, etc., cited in this specification is hereby incorporated by reference in its entirety. Excluded are application history documents that are inconsistent with or in conflict with the contents of this specification, as well as documents (currently or later attached to this specification) that limit the broadest scope of the claims of this specification. It should be noted that if the descriptions, definitions, and/or uses of terms in the materials accompanying this specification are inconsistent or in conflict with the contents of this specification, the descriptions, definitions, and/or uses of terms in this specification shall control.
Finally, it should be understood that the embodiments described herein are merely illustrative of the principles of the embodiments of the present disclosure. Other variations are also possible within the scope of the present description. Thus, by way of example, and not limitation, alternative configurations of the embodiments of the specification can be considered consistent with the teachings of the specification. Accordingly, the embodiments of the present description are not limited to only those embodiments explicitly described and depicted herein.

Claims (13)

1. A multi-source data processing method, the method comprising:
acquiring a plurality of pieces of data to be processed and a source identifier of each piece of data to be processed;
performing a preprocessing operation on the plurality of pieces of data to be processed to obtain one or more pieces of target data;
for each piece of the target data,
determining a unique identifier based on the data content of the target data and the source identifier;
determining whether the target data is valid based on the unique identifier;
in response to the target data being valid, performing at least one of the following operations based on the target data: data update and data transmission.
2. The method of claim 1, wherein acquiring the plurality of pieces of data to be processed comprises:
acquiring a plurality of pieces of raw data;
for each piece of raw data, assigning the raw data to one of one or more data queues based on a data type of the raw data;
and acquiring the plurality of pieces of data to be processed from the one or more data queues based on a preset data extraction rule.
3. The method of claim 1, wherein acquiring the source identifier of each piece of data to be processed comprises:
determining a data provider of the data to be processed;
and determining the source identifier of the data to be processed based on the data provider.
4. The method of claim 1, wherein the preprocessing operations include data filtering and data format conversion;
the data filtering includes: determining one or more valid pieces of data to be processed from the plurality of pieces of data to be processed based on at least the data content of the data to be processed;
the data format conversion includes: converting the data format of the one or more valid pieces of data to be processed to obtain the one or more pieces of target data.
5. The method of claim 1, wherein determining the unique identifier comprises:
determining one or more content identifiers indicating data content of the target data;
and determining the unique identifier by using a preset algorithm based on the one or more content identifiers and the source identifier.
6. The method of claim 1, wherein the determining whether the target data is valid comprises:
acquiring an existing data set, wherein the existing data set comprises one or more pieces of reference data, and each piece of reference data corresponds to a reference identifier;
determining target reference data related to the target data based on the unique identifier and the reference identifiers of the one or more pieces of reference data;
determining whether the target data is valid based on the data content of the target data and the reference data content of the target reference data.
7. The method of claim 1, wherein the data update comprises:
updating an existing data set associated with the target data based on the data content of the target data.
8. The method of claim 1, wherein the data transmission comprises:
obtaining relationship mapping data, wherein the relationship mapping data comprises one or more data receivers corresponding to the source identifier of the target data;
transmitting the target data or adaptation data of the target data to the one or more data receivers, the adaptation data of the target data having a data format accepted by the data receivers.
9. The method of claim 8, further comprising:
determining whether the data transmission operation was successful;
and in response to the data transmission operation being unsuccessful, re-acquiring the data to be processed corresponding to the target data and the source identifier of the data to be processed.
10. The method of claim 1, further comprising:
discarding the target data in response to the target data being invalid.
11. A multi-source data processing device, characterized in that the device comprises an acquisition module, a preprocessing module and an execution module;
the acquisition module is used for acquiring a plurality of pieces of data to be processed and the source identifier of each piece of data to be processed;
the preprocessing module is used for performing a preprocessing operation on the plurality of pieces of data to be processed to obtain one or more pieces of target data;
for each piece of target data, the execution module is used for:
determining a unique identifier based on the data content of the target data and the source identifier;
determining whether the target data is valid based on the unique identifier;
in response to the target data being valid, performing at least one of the following operations based on the target data: data update and data transmission.
12. A multi-source data processing system, comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the computer program, when executed by the processor, carries out the steps of the method according to any one of claims 1 to 10.
13. A computer-readable storage medium, characterized in that the storage medium has stored thereon a computer program which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 10.
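
For illustration only, the per-item steps recited in claims 1, 5, 6, 7, 8 and 10 might be sketched as follows. This is a minimal sketch under stated assumptions and not the implementation disclosed by the applicant: the claims do not name the "preset algorithm", so SHA-256 over a canonical JSON encoding is assumed here; the validity rule (target data is valid when it is new or differs from the stored reference content) is likewise an assumption; and the names unique_identifier, process, existing, receivers_by_source and send are hypothetical.

```python
import hashlib
import json


def unique_identifier(content_ids, source_id):
    """Combine content identifiers with the source identifier (cf. claim 5).

    Assumption: the "preset algorithm" is SHA-256 over canonical JSON.
    """
    payload = json.dumps(
        {"source": source_id, "content_ids": sorted(content_ids)},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def process(target_items, existing, receivers_by_source, send):
    """Apply the per-item steps of claim 1 to already-preprocessed target data.

    existing: dict mapping reference identifier -> reference data content
    receivers_by_source: dict mapping source identifier -> list of receivers
    send: callable(receiver, content) that performs the data transmission
    Returns the unique identifiers of the items treated as valid.
    """
    accepted = []
    for item in target_items:
        uid = unique_identifier(item["content_ids"], item["source_id"])
        # Assumed validity rule (cf. claim 6): an item whose content equals
        # the stored reference content carries no new information.
        if uid in existing and existing[uid] == item["content"]:
            continue  # invalid: discard the item (cf. claim 10)
        existing[uid] = item["content"]  # data update (cf. claim 7)
        for receiver in receivers_by_source.get(item["source_id"], []):
            send(receiver, item["content"])  # data transmission (cf. claim 8)
        accepted.append(uid)
    return accepted
```

In this sketch, feeding the same record from the same source twice results in one update and one transmission, while a later record with the same content identifiers but changed content overwrites the stored reference content and is transmitted again.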
CN202210201955.8A 2022-03-02 2022-03-02 Multi-source data processing method, device and system and storage medium Pending CN114579590A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210201955.8A CN114579590A (en) 2022-03-02 2022-03-02 Multi-source data processing method, device and system and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210201955.8A CN114579590A (en) 2022-03-02 2022-03-02 Multi-source data processing method, device and system and storage medium

Publications (1)

Publication Number Publication Date
CN114579590A true CN114579590A (en) 2022-06-03

Family

ID=81772448

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210201955.8A Pending CN114579590A (en) 2022-03-02 2022-03-02 Multi-source data processing method, device and system and storage medium

Country Status (1)

Country Link
CN (1) CN114579590A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108733758A (en) * 2018-04-11 2018-11-02 北京三快在线科技有限公司 Hotel's static data method for pushing, device, electronic equipment and readable storage medium storing program for executing
CN112860680A (en) * 2021-03-18 2021-05-28 杭州云灵科技有限公司 Data processing method and system, and data query method and system
CN113515413A (en) * 2021-07-30 2021-10-19 广东电网有限责任公司 Data management method and device, electronic equipment and storage medium
CN113641762A (en) * 2021-08-20 2021-11-12 深圳市四格互联信息技术有限公司 Information pushing method, device and system and computer readable storage device

Similar Documents

Publication Publication Date Title
US20200026721A1 (en) Method and system for generating a geocode trie and facilitating reverse geocode lookups
US20200387489A1 (en) Systems and methods for data storage and querying
CN105472045A (en) Database migration method and database migration device
WO2014134474A2 (en) System and method for performing distributed asynchronous calculations in a networked environment
WO2019095670A1 (en) Sales performance tracking method, application server and computer-readable storage medium
US11314895B2 (en) Privacy preserving data collection and analysis
CN110858332A (en) Order production method and device
CN110019200B (en) Index establishing and using method and device
WO2019118867A1 (en) Method, apparatus and computer program product for improving data indexing in a group-based communication platform
CN108351760A (en) Feed service-Engine
WO2015183373A1 (en) Partitioning a database
CN110800001B (en) System and method for data storage and data querying
CN110737655B (en) Method and device for reporting data
CN107426336B (en) Method and device for adjusting push message opening rate
CN110414613B (en) Method, device and equipment for clustering regions and computer readable storage medium
US11501247B1 (en) Optimizing delivery routing using machine learning systems
CN113190517B (en) Data integration method and device, electronic equipment and computer readable medium
CN110895761B (en) After-sales service application information processing method and device
Puspitasari et al. Development of smart parking system using internet of things concept
CN114579590A (en) Multi-source data processing method, device and system and storage medium
CN112598337A (en) Article-oriented vehicle control method, apparatus, device and computer readable medium
US10715619B2 (en) Cache management using a probabilistic data structure
CN115795187A (en) Resource access method, device and equipment
US20140379446A1 (en) Information management device, information management system, information management method, and recording medium
CN108647333A (en) A kind of information sharing method, device and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20220603
