CN114723551A - Data processing method, device and equipment based on multiple data sources and storage medium - Google Patents

Publication number
CN114723551A
Authority
CN
China
Prior art keywords
data
processed
original
structured
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210468742.1A
Other languages
Chinese (zh)
Inventor
丁平
毛亚妮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
Original Assignee
China Construction Bank Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp filed Critical China Construction Bank Corp
Priority to CN202210468742.1A priority Critical patent/CN114723551A/en
Publication of CN114723551A publication Critical patent/CN114723551A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/02Banking, e.g. interest calculation or account maintenance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23Updating
    • G06F16/2365Ensuring data consistency and integrity
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2455Query execution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/903Querying
    • G06F16/9035Filtering based on additional data, e.g. user or group profiles

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Finance (AREA)
  • Computational Linguistics (AREA)
  • Accounting & Taxation (AREA)
  • Computer Security & Cryptography (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a data processing method, apparatus, device, and storage medium based on multiple data sources. The method comprises the following steps: extracting, from at least two data sources, the data sets corresponding to the data sources, wherein each data source comprises a plurality of data sets and each data set comprises at least one piece of original data; extracting target data from the original data to obtain the data to be processed of the original data; verifying the data to be processed of the original data to obtain a data verification result for that data; and inputting the data to be processed of each piece of original data into a pre-generated initial structured template according to each data verification result, to generate target structured data. The method consolidates data from multiple data sources into one structured template and generates structured information, which reduces the processing needed when the multi-source data is used later and effectively improves data processing efficiency.

Description

Data processing method, device and equipment based on multiple data sources and storage medium
Technical Field
The present application relates to data processing technologies, and in particular, to a data processing method, apparatus, device, and storage medium based on multiple data sources.
Background
As applications in the financial field become increasingly information-based, users handle a growing volume of business data. When a user transacts business, a staff member may enter data from multiple data sources, such as user information and business information, to facilitate subsequent information verification and business handling. For example, a staff member signs a contract with the user, provides a paper copy of the contract, and records video of the user and the staff member.
In the prior art, the data of each data source, such as images, videos, and voice recordings, must be stored separately and written to a system disk. When the business needs to be handled, the relevant data is retrieved from the respective storage paths and then verified and processed separately.
However, multi-source data such as pictures, videos, and voice recordings occupies a large amount of storage space and cannot be read easily and quickly. Each time the business is handled, the data must be retrieved again for verification, structured data cannot be output quickly from the original data, and the processing efficiency of multi-source data is low.
Disclosure of Invention
The application provides a data processing method, a device, equipment and a storage medium based on multiple data sources, which are used for improving the data processing efficiency of multi-source data.
In one aspect, the present application provides a data processing method based on multiple data sources, including:
extracting, from at least two data sources, data sets corresponding to the data sources; the data source comprises a plurality of data sets, and the data sets comprise at least one piece of original data;
extracting target data in the original data to obtain data to be processed of the original data;
verifying the data to be processed of the original data to obtain a data verification result of the data to be processed of the original data;
and inputting the data to be processed of each original data into a pre-generated initial structured template according to each data verification result to generate target structured data.
In another aspect, the present application provides a data processing apparatus based on multiple data sources, including:
the data set extraction module is used for extracting data sets corresponding to at least two data sources; the data source comprises a plurality of data sets, and the data sets comprise at least one piece of original data;
the to-be-processed data acquisition module is used for extracting target data in the original data to obtain to-be-processed data of the original data;
a data verification result obtaining module, configured to verify to-be-processed data of the raw data to obtain a data verification result of the to-be-processed data of the raw data;
and the target structured data generation module is used for inputting the data to be processed of each original data into a pre-generated initial structured template according to each data verification result so as to generate target structured data.
In another aspect, the present application provides an electronic device comprising: a processor, and a memory communicatively coupled to the processor;
the memory stores computer-executable instructions;
the processor executes the computer-executable instructions stored in the memory to implement a method for data processing based on multiple data sources as described in any of the embodiments of the present application.
In another aspect, the present application provides a computer-readable storage medium having stored therein computer-executable instructions, which when executed by a processor, are used to implement a data processing method based on multiple data sources according to any embodiment of the present application.
In another aspect, the present application provides a computer program product comprising a computer program, which when executed by a processor, implements the multiple data source-based data processing method according to any of the embodiments of the present application.
In the present application, data sets are obtained from a variety of different data sources, the original data in the data sets is obtained, and the data to be processed is extracted from the original data. The data to be processed from the different data sources is verified to obtain data verification results, so the original data no longer needs to be verified every time it is used. According to the data verification results, the data to be processed of each data source is input into the same initial structured template to generate target structured data. This achieves structured processing of data from multiple data sources, lets the user use the target structured data directly when subsequently transacting business, and meets the need to acquire data in real time. It solves the prior-art problem that data in different data sources must be verified and processed every time business is handled, and effectively improves the processing efficiency of multi-source data.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.
Fig. 1 is a flowchart of a data processing method based on multiple data sources according to an embodiment of the present application;
Fig. 2 is a flowchart of a data processing method based on multiple data sources according to an embodiment of the present application;
Fig. 3 is a flowchart of a data processing method based on multiple data sources according to an embodiment of the present application;
Fig. 4 is a flowchart of a data processing method based on multiple data sources according to an embodiment of the present application;
Fig. 5 is a schematic structural diagram of a data processing apparatus based on multiple data sources according to an embodiment of the present application;
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
With the above figures, there are shown specific embodiments of the present application, which will be described in more detail below. These drawings and written description are not intended to limit the scope of the inventive concepts in any manner, but rather to illustrate the inventive concepts to those skilled in the art by reference to specific embodiments.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
It should be understood that the embodiments described are only a few embodiments of the present application, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the application, as detailed in the appended claims.
In the description of the present application, it is to be understood that the terms "first," "second," "third," and the like are used solely to distinguish one element from another; they do not necessarily describe a particular order or sequence, nor are they to be construed as indicating or implying relative importance. The specific meaning of these terms in the present application can be understood by those of ordinary skill in the art as appropriate. Further, in the description of the present application, "a plurality" means two or more unless otherwise specified. "And/or" describes the association relationship of the associated objects and covers three cases; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship.
It should be noted that, for the sake of brevity, this description does not exhaust all alternative embodiments, and it should be understood by those skilled in the art after reading this description that any combination of features may constitute an alternative embodiment as long as the features are not mutually inconsistent. The following examples are described in detail.
Fig. 1 is a flowchart of a data processing method based on multiple data sources according to an embodiment of the present application. The method provided by this embodiment is executed by a data processing apparatus based on multiple data sources. As shown in Fig. 1, the method comprises the following steps:
S101, extracting data sets corresponding to at least two data sources; each data source comprises a plurality of data sets, and each data set comprises at least one piece of original data.
The data source may be a preset database or database server. In this embodiment, the data to be processed may come from multiple data sources. A data source may include one or more data sets. A data set is a collection of original data and may include one or more pieces of original data, all from the same data source. The original data is data uploaded by the user or a staff member; for example, it may be a photograph of the user's certificate, a photograph of a transaction contract, or a transaction video. One data set may be a user identity data set, which may include photographs of the front and back of the user's certificate.
The user or a staff member packages the original data into a data set and uploads the data set to the corresponding data source for storage. Once it is determined that data processing is needed, the data sets corresponding to the data sources are acquired from the plurality of data sources, and each data set is parsed to obtain the original data it contains.
In this embodiment, the data sets may include at least one of a video data set, a user information picture set, and a business information picture set. The video data set may contain video showing the user and/or the staff member present; for example, if the user needs to make a money transfer, the staff member must upload video showing the user and the staff member present and agreeing to the service application. The user information picture set may be a data set formed by pictures of certificates that prove the user's identity, for example, photos of the front and back of the user's certificate. The business information picture set may be a data set formed by photos of the contract made between the user and the staff member, for example, photos of the paper contract.
The video data set, the user information picture set, the business information picture set, and so on may each come from a different data source. After the data sets of the data sources are obtained, the original data corresponding to each data source can be determined.
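The relationship among data sources, data sets, and original data in step S101 can be illustrated with a minimal Python sketch. The class names, the function `extract_data_sets`, and the sample identifiers such as `video_store` are hypothetical and chosen only for illustration; the application does not prescribe any particular data model.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class DataSet:
    """A collection of original data items that all come from one data source."""
    name: str
    originals: List[str] = field(default_factory=list)  # e.g. file names (assumed)


@dataclass
class DataSource:
    """A database or database server holding one or more data sets."""
    source_id: str
    data_sets: List[DataSet] = field(default_factory=list)


def extract_data_sets(sources: List[DataSource]) -> List[Tuple[str, str, str]]:
    """S101 sketch: flatten sources into (source_id, data_set_name, original) triples."""
    if len(sources) < 2:
        raise ValueError("the method expects at least two data sources")
    triples = []
    for source in sources:
        for data_set in source.data_sets:
            if not data_set.originals:
                raise ValueError(f"data set {data_set.name!r} has no original data")
            for original in data_set.originals:
                triples.append((source.source_id, data_set.name, original))
    return triples


# Hypothetical sample: one video source and one identity-picture source.
sources = [
    DataSource("video_store", [DataSet("transfer_videos", ["v1.mp4"])]),
    DataSource("id_store", [DataSet("user_identity", ["id_front.jpg", "id_back.jpg"])]),
]
triples = extract_data_sets(sources)
```

Keeping the source identifier next to each original item makes it possible to look up a per-source extraction rule in the later steps.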
S102, extracting target data from the original data to obtain the data to be processed of the original data.
Target data is extracted from the original data and determined as the data to be processed; that is, the data to be processed is extracted from the original data. A data extraction algorithm may be preset. For example, if the original data is a photo of a contract copy and the data extraction algorithm is a character extraction algorithm, the contract text in the photo can be extracted, and the extracted contract text is the data to be processed.
The original data may contain many kinds of data, and in this embodiment only part of it may be extracted as the target data. For example, if the original data is video data comprising multiple picture frames, only the facial features in the frames in which people appear may be extracted, and the extracted facial features are used as the target data.
The extraction method or extraction rule for the original data of each data source can be preset, and the methods and rules for different data sources may be the same or different. For example, for the original data of data source one, all digits in the original data need to be extracted; for the original data of data source two, the Chinese characters in the original data need to be extracted.
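The two example rules above (digits for data source one, Chinese characters for data source two) might be sketched in Python as follows. This assumes the original data has already been converted to text, for instance by a character recognition step that is not shown; the function names and the sample string are hypothetical.

```python
import re


def extract_digits(text: str) -> str:
    """Rule for data source one: keep only the ASCII digits 0-9."""
    return "".join(re.findall(r"[0-9]", text))


def extract_chinese_characters(text: str) -> str:
    """Rule for data source two: keep only Chinese characters.

    Assumption: only the basic CJK Unified Ideographs block
    (U+4E00 to U+9FFF) is matched; rarer extension blocks are ignored.
    """
    return "".join(re.findall(r"[\u4e00-\u9fff]", text))


sample = "合同编号 A-2022-0468"  # hypothetical recognized text
digits = extract_digits(sample)
chinese = extract_chinese_characters(sample)
```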
After the data to be processed is obtained, it may be associated with the original data so that its source can be determined, that is, the original data or the data source from which the data to be processed was obtained. The original data may be persisted as a backup; for example, each piece of original data may be labeled and its access address recorded for subsequent retrieval and use.
S103, verifying the data to be processed of the original data to obtain a data verification result of the data to be processed of the original data.
After the data to be processed is obtained, it can be verified to obtain a data verification result for the data to be processed corresponding to the original data, so as to determine whether the data to be processed is correct. The data verification result may be either verification passed or verification failed.
The verification may be a format check or a content check. For example, for a format check, the format of the data to be processed may be preset as an 18-digit number; the check then tests whether the data to be processed is an 18-digit number, passing if so and failing if not. A content check examines the specific content of the data to be processed and can determine whether it contains erroneous characters. For example, if the correct value stored in advance is 001 and the data actually extracted is 002, the data to be processed fails the check.
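The format check and content check described above can be sketched in a few lines of Python. The function names are hypothetical, and comparing against a single stored reference value is only one simple reading of the content check.

```python
import re


def format_check(value: str) -> bool:
    """Format check sketch: pass only if the value is exactly 18 ASCII digits."""
    return re.fullmatch(r"[0-9]{18}", value) is not None


def content_check(value: str, expected: str) -> bool:
    """Content check sketch: pass only if the extracted value equals the
    correct value stored in advance (assumed to be available as `expected`)."""
    return value == expected


format_ok = format_check("110105199001011234")   # 18 digits, made-up value
format_bad = format_check("12345")               # too short
content_bad = content_check("002", expected="001")
```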
Verifying the data to be processed yields its data verification result, which is also the data verification result of the original data corresponding to that data.
If the data to be processed fails verification, prompt information can be sent to remind the user or a staff member to check it, which improves the efficiency and accuracy of data processing and, in turn, of business handling.
S104, inputting the data to be processed of each piece of original data into a pre-generated initial structured template according to each data verification result, to generate target structured data.
The data to be processed of each piece of original data corresponds to one data verification result. After the verification results are obtained, the data to be processed that passed verification and the data that failed are identified. The data that passed verification is added to the pre-generated initial structured template to obtain complete structured data, which serves as the target structured data.
The initial structured template is a pre-designed template. It may define the names of the fields, and the positions where the field contents are to be filled may be blank. When data to be processed is input, the field name of each successfully verified item is determined, and the item is filled in at the position of the corresponding field name. For example, if the data to be processed is the user name, it may be filled in at the position of the "user name" field. Each piece of target structured data can correspond to a unique structured data identifier; for example, the user's identity ID or the contract number can be used as the structured data identifier.
Data to be processed that fails verification may be left out of the initial structured template. Alternatively, it may be filled in at the position of the corresponding field name with a verification-failed identifier attached, which improves the accuracy of the generated target structured data and prompts staff to review the item, improving the efficiency and accuracy of business handling.
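Step S104 can be sketched as filling a dictionary whose keys are the field names of the template. This is a minimal sketch under stated assumptions: the template is modeled as a plain `dict`, the `verified` flag stands in for the verification-failed identifier, and the names `build_template`, `fill_template`, and `keep_failed` are hypothetical.

```python
from typing import Dict, Iterable, Optional, Tuple

Template = Dict[str, Optional[dict]]


def build_template(field_names: Iterable[str]) -> Template:
    """Initial structured template: field names present, contents blank."""
    return {name: None for name in field_names}


def fill_template(
    template: Template,
    checked: Dict[str, Tuple[str, bool]],
    keep_failed: bool = False,
) -> Template:
    """Fill fields from `checked`, a map of field name -> (value, passed).

    Items that passed are always filled in. Items that failed are either
    left out (default) or filled in with verified=False as a failure mark.
    """
    result = dict(template)
    for name, (value, passed) in checked.items():
        if name not in result:
            continue  # no such field in the template
        if passed:
            result[name] = {"value": value, "verified": True}
        elif keep_failed:
            result[name] = {"value": value, "verified": False}
    return result


template = build_template(["user name", "id number"])
checked = {"user name": ("Zhang San", True), "id number": ("12345", False)}
strict = fill_template(template, checked)
marked = fill_template(template, checked, keep_failed=True)
```

The two calls correspond to the two options in the text: `strict` leaves the failed field blank, while `marked` keeps the value together with a failure mark so that staff can review it.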
In this embodiment, data sets are obtained from multiple different data sources, the original data in the data sets is obtained, and the data to be processed is extracted from the original data. The data to be processed from the different data sources is verified to obtain data verification results, so the original data no longer needs to be verified every time it is used. According to the data verification results, the data to be processed of each data source is input into the same initial structured template to generate target structured data. This achieves structured processing of data from multiple data sources, lets the user use the target structured data directly when subsequently transacting business, and meets the need to acquire data in real time. It solves the prior-art problem that data in different data sources must be verified and processed every time business is handled, and effectively improves the processing efficiency of multi-source data.
Fig. 2 is a flowchart of a data processing method based on multiple data sources according to an embodiment of the present application, which is an alternative embodiment based on the above-mentioned embodiment.
In this embodiment, extracting the target data from the original data to obtain the data to be processed of the original data can be refined as follows: extracting target data from the original data in the data set corresponding to a data source according to a preset association relationship, and determining the target data as the data to be processed of the original data, wherein the preset association relationship is an association relationship between the data source and a data extraction rule.
As shown in fig. 2, the method comprises the steps of:
S201, extracting data sets corresponding to at least two data sources; each data source comprises a plurality of data sets, and each data set comprises at least one piece of original data.
S202, extracting target data from the original data in the data set corresponding to a data source according to a preset association relationship, and determining the target data as the data to be processed of the original data; the preset association relationship is an association relationship between the data source and a data extraction rule.
For example, the original data of the first data source is a video, the original data of the second data source is a picture, and the way of extracting the data to be processed from the video is different from the way of extracting the data to be processed from the picture.
The association relationship between data sources and data extraction rules is preset, where a data extraction rule is a way of extracting the data to be processed from the original data. According to the preset association relationship, the way of extracting the data to be processed from the original data is determined and the extraction is performed to obtain the target data, which is the data to be processed.
With the association relationship preset, the data to be processed can be extracted in a way suited to each kind of original data, which helps avoid data extraction errors and improves the accuracy and efficiency of data processing.
In this embodiment, extracting target data from the original data in the data set corresponding to a data source according to the preset association relationship includes: determining, for the data set corresponding to any data source, a target data extraction rule according to the preset association relationship between data sources and data extraction rules; and obtaining the target data from the original data according to the target data extraction rule.
Specifically, the association relationship between data sources and data extraction rules is preset, and the data extraction rules associated with different data sources may be the same or different. The target data extraction rule associated with the data source of the original data is determined according to the preset association relationship.
When the data to be processed is extracted from a piece of original data, the data set to which the original data belongs may be determined first, and then the data source to which that data set belongs, which gives the data source of the original data. According to the preset association relationship between data sources and data extraction rules, the data extraction rule associated with that data source can be determined and taken as the target data extraction rule for the original data. The target data extraction rule is used to extract the target data, which is the data to be processed.
After the target data extraction rule is obtained, the target data is extracted from the original data in the data set corresponding to the data source according to that rule and determined as the data to be processed of the original data. For example, if the original data is a picture and the target data extraction rule is to extract the face information in the original data, the face information in the picture is extracted as the target data, giving the data to be processed.
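The lookup chain described above (original data, then data set, then data source, then extraction rule) can be sketched with three dictionaries. Everything named here is hypothetical: the identifiers, the two toy rules, and the use of text in place of pictures or videos are assumptions made to keep the sketch runnable.

```python
from typing import Callable, Dict

ExtractionRule = Callable[[str], str]


def keep_digits(raw: str) -> str:
    """Toy rule: keep only ASCII digits."""
    return "".join(ch for ch in raw if ch in "0123456789")


def keep_letters(raw: str) -> str:
    """Toy rule: keep only alphabetic characters."""
    return "".join(ch for ch in raw if ch.isalpha())


# Preset association relationships (all identifiers are made up).
DATA_SET_OF_ORIGINAL: Dict[str, str] = {
    "contract_001": "contracts",
    "id_front": "user_identity",
}
SOURCE_OF_DATA_SET: Dict[str, str] = {
    "contracts": "business_db",
    "user_identity": "identity_db",
}
RULE_OF_SOURCE: Dict[str, ExtractionRule] = {
    "business_db": keep_letters,
    "identity_db": keep_digits,
}


def target_rule_for(original_id: str) -> ExtractionRule:
    """Resolve original data -> data set -> data source -> extraction rule."""
    data_set = DATA_SET_OF_ORIGINAL[original_id]
    source = SOURCE_OF_DATA_SET[data_set]
    return RULE_OF_SOURCE[source]


def extract_target_data(original_id: str, raw: str) -> str:
    """Apply the target data extraction rule to the original data."""
    return target_rule_for(original_id)(raw)


id_digits = extract_target_data("id_front", "ID: 1101 0519")
contract_letters = extract_target_data("contract_001", "No. 42 Loan")
```

Because the rule is selected by data source rather than hard-coded, adding a new data source only requires a new entry in the rule mapping.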
The advantage of this arrangement is that the target data extraction rule is determined from the association relationship and the target data is then located in the original data and extracted, which helps avoid data extraction errors. Data can be extracted from different data sources, which improves the flexibility, accuracy, and efficiency of data processing and meets the requirements of different business scenarios.
In this embodiment, the data set is a video data set, and the original data is an original video in the video data set. Correspondingly, obtaining target data from the original data according to the target data extraction rule comprises: performing feature extraction on the pictures in the original video according to a preset video analysis algorithm to obtain video feature values; and taking the video feature values as scene information, constructing a structured scene information set of the video data set, and determining the structured scene information set as the target data.
Specifically, the data set may be a video data set, which may include one or more videos; the original data in the video data set is video and is referred to as the original video. If the data set in a data source is a video data set, the target data extraction rule associated with that data source may be preset as a video analysis algorithm. The video analysis algorithm may be a face analysis algorithm, a scene analysis algorithm, or the like; for example, it may determine whether the video scene is indoors or outdoors, or determine the number of people in the video.
When the data to be processed is extracted from an original video in the video data set, the original video may be decomposed into multiple video frames. Feature extraction is performed on the pictures of the video frames according to the preset video analysis algorithm, and the extracted feature values are determined as the video feature values. In this embodiment, which features are extracted is determined by the video analysis algorithm; for example, a video feature value may be a scene feature value, a face feature value, or the like.
The extracted video feature values are determined as the scene information of the original video, a structured scene information set of the video data set is constructed from the scene information, and the structured scene information set is determined as the target data. For example, the structured scene information set may be named set union, and each piece of scene information extracted from the original video is taken as an element of that set, yielding the structured scene information set.
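The construction of a structured scene information set from per-frame feature values might look like the following Python sketch. Real video decoding and analysis are out of scope here: frames are modeled as small dictionaries, and `indoor_outdoor` is a stand-in analyzer with an arbitrary brightness threshold. All names and values are assumptions for illustration, not the algorithm of the application.

```python
from typing import Callable, Dict, Iterable, List, Optional

Frame = Dict[str, float]  # stand-in for a decoded video frame


def indoor_outdoor(frame: Frame) -> Optional[str]:
    """Stand-in scene analyzer: classify by a made-up brightness threshold."""
    brightness = frame.get("brightness")
    if brightness is None:
        return None  # nothing usable in this frame
    return "outdoor" if brightness > 0.7 else "indoor"


def build_scene_info_set(
    frames: Iterable[Frame],
    analyze: Callable[[Frame], Optional[str]],
) -> Dict[str, object]:
    """Collect distinct per-frame feature values as elements of one set.

    A list is used to keep first-seen order; duplicates are skipped.
    """
    elements: List[str] = []
    for frame in frames:
        value = analyze(frame)
        if value is not None and value not in elements:
            elements.append(value)
    return {"name": "scene_info", "elements": elements}


frames = [{"brightness": 0.2}, {"brightness": 0.3}, {"brightness": 0.9}, {}]
scene_info_set = build_scene_info_set(frames, indoor_outdoor)
```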
The advantage of this arrangement is that video feature values can be obtained from the dynamic original video by means of the video analysis algorithm, which meets business requirements in video scenarios. Using the video feature values as scene information to construct a structured scene information set achieves structured processing of the video data and improves the efficiency of subsequently generating the target structured data.
In this embodiment, according to a preset video analysis algorithm, performing feature extraction on a picture in an original video to obtain a video feature value, including: determining the number of people in the original video according to a preset video people analysis algorithm; determining a target service scene identifier of the video data set according to the incidence relation between the number of the characters and the service scene identifier; according to a preset character feature extraction algorithm, feature extraction is carried out on any character in the original video to obtain a character feature value; and determining the target service scene identification and the character characteristic value as video characteristic values.
Specifically, when feature extraction is performed on a picture in an original video according to a video analysis algorithm, the number of people and facial features in the picture can be extracted. For example, a video people analysis algorithm is preset and can be used to determine the number of people in the original video. For example, when a user transacts a large volume of business, the user's own person and staff are required to be present, i.e., the number of people in the original video should be two.
The method comprises the steps of presetting an incidence relation between the number of characters and service scene identifications, wherein the service scene identifications are used for representing different service scenes, and each service scene can be represented by a unique service scene identification. For example, the service scenario of service one is labeled 01, requiring two people to be present; the service scene of service two is identified as 02, and one person is needed to be present. According to the determined number of the characters and the preset incidence relation between the number of the characters and the service scene identification, the service scene corresponding to the original video in the video data set, namely the target service scene, can be determined, and the target service scene identification of the target service scene can be determined. For example, if it is determined that the number of people in the original video is two, it may be determined that the target service scene of the original video is service one, and the target service scene identifier is 01.
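The lookup from the number of people to a service scene identifier can be illustrated with a minimal Python sketch. The mapping values come from the example in the text (two people for 01, one person for 02); the names PEOPLE_COUNT_TO_SCENE_ID and target_scene_id are hypothetical, and returning None for an unmapped count is an assumption, since the text does not say how that case is handled.

```python
# Hypothetical preset association between the number of people and the
# service scene identifier, using the example values from the text.
PEOPLE_COUNT_TO_SCENE_ID = {
    2: "01",  # service one: two people must be present
    1: "02",  # service two: one person must be present
}


def target_scene_id(people_count):
    """Return the target service scene identifier for a people count.

    Returns None when no association is preset for that count
    (an assumption; the source does not describe this case).
    """
    return PEOPLE_COUNT_TO_SCENE_ID.get(people_count)
```

The number of people itself would come from the video people analysis algorithm, which is outside the scope of this sketch.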
Besides the number of people in the original video, the characteristics of people in the original video can be extracted. A character feature extraction algorithm can be preset, and feature extraction is carried out on each character in the original video according to the preset character feature extraction algorithm to obtain character feature values. The character features may be appearance features such as face features, height features, and scale features.
All information extracted from the original video is used as video characteristic values, that is, the video characteristic values may include the number of people, the target service scene, the identification of the target service scene, the character characteristic values, and the like.
The advantage of this arrangement is that features can be comprehensively extracted from the same original video with different feature extraction algorithms, which avoids omitting features and improves the precision of data processing.
In this embodiment, the constructing a structured scene information set of a video data set by using the video feature value as scene information includes: acquiring an initial scene information set of a preset video data set; and inputting the target service scene identification and the character characteristic value into the initial scene information set to obtain a structured scene information set of the video data set.
Specifically, an initial scene information set of the video data set is preset, and the initial scene information set may be a preset format of a structured scene information set of the video data set. After the video characteristic values are obtained, the video characteristic values can be filled in the initial scene information set according to a preset format, and a complete structured scene information set is obtained.
For example, the video feature value may include a target service scene identifier and a character feature value, and the initial scene information set may be preset as video_c_face = {video_c_face1 (feature value of person 1 in the video), video_c_face2 (feature value of person 2 in the video), video_c_face3 (feature value of person 3 in the video), video_c_face4 (feature value of person 4 in the video), video_c_tag (target service scene identifier)}. After the target service scene identifier and the character feature values are obtained, these video feature values are used as video tags and filled into the corresponding positions of the initial scene information set to obtain the structured scene information set.
The advantage of this arrangement is that, because the initial scene information set is preset, the structured scene information set can be obtained quickly, which achieves structured processing of the original video and effectively improves data processing efficiency.
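Filling the preset set can be sketched as follows. This is only an illustration under assumptions: the set is modeled as a Python dict with the field names from the example, empty positions are represented by None, feature values are plain lists of floats, and the function name build_scene_info_set is hypothetical.

```python
# Hypothetical preset format of the initial scene information set.
# Field names follow the example in the text; None marks an unfilled position.
INITIAL_SCENE_INFO_SET = {
    "video_c_face1": None,
    "video_c_face2": None,
    "video_c_face3": None,
    "video_c_face4": None,
    "video_c_tag": None,
}


def build_scene_info_set(scene_id, face_features):
    """Fill a copy of the initial set with the scene identifier and features."""
    max_faces = 4  # the example format has four person positions
    if len(face_features) > max_faces:
        raise ValueError("more person feature values than preset positions")
    result = dict(INITIAL_SCENE_INFO_SET)  # leave the preset format untouched
    for index, feature in enumerate(face_features, start=1):
        result[f"video_c_face{index}"] = feature
    result["video_c_tag"] = scene_id
    return result
```

Copying the preset format before filling it keeps the initial set reusable for the next original video.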
In this embodiment, obtaining the target data from the original data according to the target data extraction rule further includes: acquiring voice in an original video, converting the voice in the original video into characters according to a preset voice recognition algorithm, and generating a voice text; acquiring voice keywords from a voice text according to a preset keyword extraction algorithm; and constructing a structured keyword information set of the video data set according to the voice keywords to serve as target data.
Specifically, for an original video in the video data set, not only the video feature values of the picture but also the voice information in the video may be acquired. The voice in the original video is acquired according to a preset voice acquisition algorithm. The preset speech recognition algorithm may be, for example, an ASR (Automatic Speech Recognition) algorithm. The speech recognition algorithm automatically converts the voice of the video into text to generate a voice text.
A word segmentation algorithm and a keyword extraction algorithm are preset. The voice text is segmented according to the word segmentation algorithm to obtain the words in the voice text, and the keyword extraction algorithm determines the voice keywords from those words. According to the obtained voice keywords, a structured keyword information set of the video data set is constructed and determined as target data. An initial keyword information set can be preset; after the voice keywords are obtained, they are filled into the initial keyword information set to obtain a complete structured keyword information set. For example, the initial keyword information set is video_c_text = {text, keyword 1, keyword 2, keyword 3}.
The advantage of this arrangement is that the voice in the original video is acquired and converted into text, so that sound information is not omitted and the information in the video is acquired comprehensively, which improves the precision of data processing and, in turn, the efficiency and precision of subsequent business handling.
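The segmentation and keyword steps can be sketched as below. The source does not name the word segmentation or keyword extraction algorithms, so this sketch substitutes two simple stand-ins and says so plainly: a regular-expression tokenizer for segmentation and word frequency with a small stop-word list for keyword extraction. The names STOP_WORDS and build_keyword_info_set and the dict keys are hypothetical, and the input is assumed to be the voice text already produced by speech recognition.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "the", "to", "and", "of", "i", "want"}


def build_keyword_info_set(voice_text, keyword_count=3):
    """Build a keyword information set shaped like video_c_text.

    Stand-ins (assumptions, not the source's algorithms):
    - segmentation: lowercase word tokens found by a regular expression
    - keyword extraction: most frequent non-stop words
    """
    words = re.findall(r"[a-z0-9]+", voice_text.lower())
    counts = Counter(word for word in words if word not in STOP_WORDS)
    # Sort by descending frequency, then alphabetically, for a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    keywords = [word for word, _ in ranked[:keyword_count]]
    keywords += [None] * (keyword_count - len(keywords))  # pad empty slots
    info_set = {"text": voice_text}
    for index, keyword in enumerate(keywords, start=1):
        info_set[f"keyword{index}"] = keyword
    return info_set
```

For Chinese speech, whitespace-free text would need a dedicated segmentation library in place of the regular expression.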
In this embodiment, the data set is a user information picture set and/or a service information picture set, and the original data is a user information picture and/or a service information picture; correspondingly, according to the target data extraction rule, obtaining target data from the original data, including: obtaining user information in the user information picture and/or service information in the service information picture through a preset character recognition algorithm; constructing a structured user information set of a user information picture set according to user information, and using the structured user information set as target data; and/or constructing a structured service information set of the service information picture set according to the service information, and using the structured service information set as target data.
Specifically, the data set may further include a user information picture set and/or a service information picture set, and the like, where the original data in the user information picture set is a user information picture, and the original data in the service information picture set is a service information picture. The user information picture can be a photo of various certificates of the user, and the service information picture can be a photo of a contract signed by the user.
A character recognition algorithm may be preset, for example an OCR (Optical Character Recognition) algorithm. The characters in the pictures are recognized according to the character recognition algorithm to obtain the user information in the user information picture and the service information in the service information picture. For example, information such as the age and name on a certificate can be obtained by character recognition, as can information such as the names and dates on a contract.
A face recognition algorithm or another image recognition algorithm can also be set for recognizing faces or other images in the user information picture and the service information picture.
After the user information is obtained, a structured user information set of the user information picture set can be constructed and used as target data. After the service information is obtained, a structured service information set of the service information picture set can be constructed as target data.
An initial user information set of the user information picture set and an initial service information set of the service information picture set may be preset. The user information is filled into the initial user information set to obtain a complete structured user information set, and the service information is filled into the initial service information set to obtain a complete structured service information set. For example, the structured user information set may be obtained as image_uc_json = {"user name": "Zhang San", "user certificate number": "1231341234", "user account": "342394917843"}.
The advantage of this arrangement is that recognizing the characters in a picture with the character recognition algorithm improves the efficiency of extracting information from the picture. Generating a structured user information set and a structured service information set from the extracted user information and service information achieves structured processing of the data and improves data processing efficiency.
S203, verifying the to-be-processed data of the original data to obtain a data verification result of the to-be-processed data of the original data.
S204, inputting the data to be processed of each original data into a pre-generated initial structured template according to each data verification result to generate target structured data.
In the embodiment of the present application, data sets are obtained from multiple different data sources, the original data in each data set is obtained, and the data to be processed is extracted from the original data. The data to be processed from the different data sources is verified to obtain data verification results, which avoids re-verifying the original data each time it is used. According to the data verification results, the data to be processed of each data source is input into the same initial structured template to generate target structured data. This achieves structured processing of data from multiple data sources, allows the target structured data to be used directly when a user subsequently transacts business, and meets the requirement of acquiring data in real time. It solves the prior-art problem that data in different data sources must be checked and processed every time business is handled, and effectively improves the processing efficiency of multi-source data.
Fig. 3 is a flowchart of a data processing method based on multiple data sources according to an embodiment of the present application, which is an alternative embodiment based on the above-mentioned embodiment.
In this embodiment, verifying the to-be-processed data of the original data to obtain a data verification result of the to-be-processed data may be refined as: verifying the to-be-processed data of the original data according to a preset data verification rule to obtain a data verification result of the to-be-processed data.
As shown in fig. 3, the method comprises the steps of:
S301, extracting a data set corresponding to the data source from at least two data sources; the data source comprises a plurality of data sets, and each data set comprises at least one piece of original data.
S302, extracting target data in the original data to obtain to-be-processed data of the original data.
S303, verifying the to-be-processed data of the original data according to a preset data verification rule to obtain a data verification result of the to-be-processed data.
The data verification rules are preset and are used to verify the data to be processed; the rules for data to be processed from different data sources may be the same or different. The data to be processed is verified according to the data verification rule to obtain a data verification result. For example, a data verification rule is preset for the data to be processed of the original data corresponding to each data set under each data source. After the data to be processed is obtained, the original data it came from is determined, and the corresponding data set and data source are obtained from that original data. The target data verification rule associated with the data to be processed is then determined from the preset data verification rules, and the data to be processed is verified according to the target data verification rule to obtain a result of verification passed or verification failed.
The data verification rule may specify the format, content, and the like of the data to be processed, for example its character length and character range. If the length of the data to be processed differs from the character length specified in the data verification rule, the data to be processed fails verification; if the data to be processed contains characters outside the specified character range, it also fails verification. For example, if the data verification rule specifies that the data to be processed consists of upper-case and lower-case English letters and the data actually obtained contains a digit, the data to be processed fails verification.
The data verification rule can be modified according to actual requirements. Verifying data according to such rules meets the needs of various service scenes and allows different data to be processed to be verified in a targeted way, which improves the precision and flexibility of data verification.
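A length and character-range rule of this kind can be sketched in Python as follows. The FormatRule class, its field names, and the use of a regular-expression character class to express the character range are all assumptions made for illustration; the source does not describe how rules are represented.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class FormatRule:
    """Hypothetical data verification rule: exact length plus allowed characters."""
    length: int
    allowed_chars: str  # a regular-expression character class, e.g. "[A-Za-z]"


def check_format(value, rule):
    """Return True if the value passes the rule, False if verification fails."""
    if len(value) != rule.length:
        return False  # wrong character length
    # Every character must fall inside the specified character range.
    return re.fullmatch(f"{rule.allowed_chars}*", value) is not None
```

With a rule of six English letters, `check_format("AbcDef", FormatRule(6, "[A-Za-z]"))` passes, while a value containing a digit or a value of another length fails.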
In this embodiment, according to a preset data verification rule, verifying to-be-processed data of original data to obtain a data verification result of the to-be-processed data, including: comparing the data to be processed with pre-stored database information; and if the data to be processed is consistent with the database information, determining that the data verification result is verification passing.
Specifically, before transacting various services, the user needs to perform identity registration, and the information registered by the user can be stored, and the information of the staff can be stored in advance. For example, information such as names, ages, and face images of users and workers may be stored in a database, and the information stored in advance may be used as database information.
The data verification rule may be an online check of the data to be processed, in which the data to be processed is compared with pre-stored database information. The database is searched for database information consistent with the data to be processed. If such information exists, the data to be processed is determined to be correct and the data verification result is verification passed; if it does not exist, the data verification result is verification failed.
For example, the name and age of a staff member in the data to be processed may be compared with the database information to determine whether the staff member's identity is genuine. If the staff member's information is inconsistent with the database information, the staff member may be using a false identity, the data verification result is verification failed, and prompt information can be sent to remind the user or the staff member to verify. As another example, the face information of a user who wants to transact business may be compared with the face information in the database to determine whether the user is the actual business transactor. If the face information in the database information is consistent with the face information of the user, the data verification result is verification passed. An association check may also be carried out on user information such as face information and name, and the data verification result is verification passed only if every piece of the user's information is correct. For example, if the user's name is "Zhang San" and the identity identifier is "00001", and both "Zhang San" and "00001" exist in the database but the name corresponding to "00001" in the database is "Li Si", the data verification result is verification failed.
The advantage of this arrangement is that comparing against the database information enables automatic online checking, avoids erroneous check results, and improves the precision of data verification, which in turn improves the efficiency and precision of data processing.
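The association check in the last example can be sketched as below. The database is modeled as an in-memory dict keyed by identity identifier, which is an assumption for illustration (the source only says the information is pre-stored); the records and the function name check_user are likewise hypothetical.

```python
# Hypothetical pre-stored database information, keyed by identity identifier.
DATABASE = {
    "00001": {"name": "Li Si"},
    "00002": {"name": "Zhang San"},
}


def check_user(name, identity_id, database):
    """Association check: the name must belong to the given identity identifier.

    It is not enough for the name and the identifier to each exist
    somewhere in the database; they must be stored in the same record.
    """
    record = database.get(identity_id)
    return record is not None and record["name"] == name
```

Here `check_user("Zhang San", "00001", DATABASE)` fails even though both values exist in the database, because the record for "00001" carries a different name.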
In this embodiment, the data to be processed is a structured scene information set and a structured user information set; correspondingly, after the data to be processed is consistent with the database information, the method further comprises the following steps: acquiring character characteristic values of users in the structured scene information set and user information in the structured user information set; searching a face characteristic value of a user from pre-stored database information according to the user information; and if the face characteristic value is consistent with the character characteristic value, determining that the data verification result is verification pass.
Specifically, for the data to be processed of the structured scene information set and the structured user information set, the structured scene information set and the structured user information set may be subjected to correlation check. The structured scene information set can be obtained from video feature values, i.e. from pictures in the original video. The original video has a person image of the user, for example, a face image. The structured user information set may be derived from user information, i.e. from a picture of user information. The user information image may include a person image such as a face image.
When the structured scene information set and the structured user information set are checked, a video characteristic value of a person image of a user, namely a person characteristic value, can be obtained from the structured scene information set, and the person characteristic value can be a characteristic value of a face of the user in an original video. And acquiring user information from the structured user information set, wherein the acquired user information can comprise a person image, a name, an identity mark and the like.
Various information of the user is stored in the database in advance, and according to the user information, the database information corresponding to the user information can be searched from the database, for example, the face feature value of the user can be found. Comparing the face characteristic value in the database with the character characteristic value of the user in the structured scene information set, and if the face characteristic value is consistent with the character characteristic value of the user in the structured scene information set, determining that the structured scene information set and the structured user information set pass the verification; and if the two are not consistent, determining that the verification of the structured scene information set and the structured user information set fails.
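The comparison of the two feature values can be sketched as below. The source does not define what "consistent" means for feature values, so this sketch assumes the feature values are numeric vectors and treats them as consistent when their cosine similarity reaches a threshold; the threshold of 0.9 and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector carries no direction to compare
    return dot / (norm_a * norm_b)


def features_consistent(db_face_feature, video_character_feature, threshold=0.9):
    """Return True when the two feature values are treated as consistent.

    The similarity measure and the threshold are assumptions; the source
    only requires that the two values be consistent.
    """
    if len(db_face_feature) != len(video_character_feature):
        raise ValueError("feature values must have the same dimension")
    return cosine_similarity(db_face_feature, video_character_feature) >= threshold
```

A result of True would correspond to the structured scene information set and the structured user information set passing the association check.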
The advantage of this arrangement is that association checks can be performed across different data to be processed, which prevents mismatched data, ensures that the user-related information in the data to be processed is consistent, and effectively improves the precision of data verification and data processing.
S304, inputting the data to be processed of each original data into a pre-generated initial structured template according to each data verification result to generate target structured data.
In the embodiment of the present application, data sets are obtained from multiple different data sources, the original data in each data set is obtained, and the data to be processed is extracted from the original data. The data to be processed from the different data sources is verified to obtain data verification results, which avoids re-verifying the original data each time it is used. According to the data verification results, the data to be processed of each data source is input into the same initial structured template to generate target structured data. This achieves structured processing of data from multiple data sources, allows the target structured data to be used directly when a user subsequently transacts business, and meets the requirement of acquiring data in real time. It solves the prior-art problem that data in different data sources must be checked and processed every time business is handled, and effectively improves the processing efficiency of multi-source data.
Fig. 4 is a flowchart of a data processing method based on multiple data sources according to an embodiment of the present application, which is an alternative embodiment based on the above-mentioned embodiment.
In this embodiment, according to each data verification result, the to-be-processed data of each raw data is input into the pre-generated initial structured template to generate the target structured data, which may be subdivided into: according to the data verification results, corresponding result identifiers are added to the to-be-processed data of the original data to obtain combined fields of the to-be-processed data; and adding each combined field to a corresponding position in a pre-generated initial structured template to obtain the target structured data.
As shown in fig. 4, the method comprises the steps of:
S401, extracting a data set corresponding to the data source from at least two data sources; the data source comprises a plurality of data sets, and each data set comprises at least one piece of original data.
S402, extracting target data in the original data to obtain to-be-processed data of the original data.
S403, verifying the to-be-processed data of the original data to obtain a data verification result of the to-be-processed data of the original data.
S404, according to the data verification results, corresponding result identifications are added to the to-be-processed data of the original data to obtain the combined fields of the to-be-processed data.
The data verification result may be verification passed or verification failed, and different data verification results may correspond to different result identifiers. The result identifiers corresponding to the different data verification results are preset; for example, the result identifier for verification passed is "1" and the result identifier for verification failed is "0".
Each data to be processed corresponds to a data verification result. And after the data verification result of each piece of data to be processed is obtained, determining a result identifier corresponding to the data verification result, and adding the result identifier to the corresponding data to be processed. The result identifier may be added after the last character of the data to be processed, for example, if the data to be processed is "422142" and the data verification result is verification pass, "1" may be added to the last digit, resulting in "4221421". The data to be processed is an original field to be added to the initial structured template, and the data to be processed with the result identifier added is a combined field. Whether the data to be processed passes the verification or fails the verification, a combined field can be correspondingly generated.
S405, adding each combined field to a corresponding position in a pre-generated initial structured template to obtain target structured data.
After the combined field of each piece of data to be processed is obtained, the specific filling position corresponding to the combined field, that is, the filling position of each piece of data to be processed, is looked up in the initial structured template. Each combined field is filled into its corresponding filling position, and the completed initial structured template serves as the target structured data.
In this embodiment, a user blacklist check may also be performed. Whether the user is on a preset blacklist is determined according to the user information; if so, the user is determined to have a bad record, and a blacklist identifier may be added to the target structured data, for example after the user name.
In a financial transaction business scene, the amount of funds transferred by the user can be checked. The user's maximum transfer amount is calculated from the transaction information provided by the user according to a preset service rule, and the user's actual transfer amount is determined from the user information. The actual transfer amount is compared with the maximum transfer amount to determine whether it exceeds the maximum; if so, an excess identifier is added to the actual transfer amount in the target structured data; if not, a not-exceeded identifier may be added to the target structured data.
By adding identifiers and generating combined fields, the verification result of the data to be processed can be indicated in the target structured data, which facilitates subsequent business processing and improves its efficiency and precision. The original data in the data sources is converted into structured data, achieving the effect of processing once and using many times: after data processing is completed by the present application, the target structured data can be used directly when subsequent business is transacted, without checking the data again or regenerating the structured information, so structured data can be output quickly for the original data. In addition, the target structured data occupies little storage space, which helps make efficient use of storage.
In the embodiment of the present application, data sets are obtained from multiple different data sources, the original data in each data set is obtained, and the data to be processed is extracted from the original data. The data to be processed from the different data sources is verified to obtain data verification results, which avoids re-verifying the original data each time it is used. According to the data verification results, the data to be processed of each data source is input into the same initial structured template to generate target structured data. This achieves structured processing of data from multiple data sources, allows the target structured data to be used directly when a user subsequently transacts business, and meets the requirement of acquiring data in real time. It solves the prior-art problem that data in different data sources must be checked and processed every time business is handled, and effectively improves the processing efficiency of multi-source data.
Fig. 5 is a schematic structural diagram of a data processing apparatus based on multiple data sources according to an embodiment of the present application, where the apparatus may be implemented by software, hardware, or a combination of the two. As shown in fig. 5, the apparatus includes: a data set extraction module 501, a to-be-processed data obtaining module 502, a data verification result obtaining module 503, and a target structured data generation module 504.
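Steps S404 and S405 can be sketched together as follows. The result identifiers "1" and "0" and the rule of appending the identifier after the last character come from the text; the template positions, the dict representation of the template, and the function names are assumptions for illustration.

```python
# Hypothetical initial structured template; None marks an unfilled position.
INITIAL_TEMPLATE = {
    "user_account": None,
    "user_certificate_number": None,
}


def make_combined_field(value, passed):
    """Append the result identifier ("1" passed, "0" failed) after the last character."""
    return value + ("1" if passed else "0")


def fill_template(template, checked_data):
    """Fill a copy of the template with combined fields.

    checked_data maps a filling position to (data to be processed, passed).
    A combined field is generated whether verification passed or failed.
    """
    result = dict(template)
    for position, (value, passed) in checked_data.items():
        if position not in result:
            raise KeyError(f"no filling position named {position!r}")
        result[position] = make_combined_field(value, passed)
    return result
```

For the example in the text, `make_combined_field("422142", True)` yields "4221421".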
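The comparison step can be sketched as below. The preset service rule that yields the maximum transfer amount is not described in the source, so the maximum is simply passed in; the identifier strings "excess" and "not_excess" and the function name are hypothetical.

```python
from decimal import Decimal


def transfer_limit_identifier(actual_amount, max_amount):
    """Return the identifier to add to the target structured data.

    max_amount is assumed to have been calculated beforehand from the
    transaction information according to the preset service rule.
    """
    if actual_amount > max_amount:
        return "excess"
    return "not_excess"


# Usage: an actual transfer equal to the maximum does not exceed it.
identifier = transfer_limit_identifier(Decimal("50000.00"), Decimal("50000.00"))
```

Decimal is used instead of float so that monetary amounts compare exactly.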
A data set extracting module 501, configured to extract, from at least two data sources, a data set corresponding to the data source; the data source comprises a plurality of data sets, and the data sets comprise at least one piece of original data;
a to-be-processed data obtaining module 502, configured to extract target data in the raw data to obtain to-be-processed data of the raw data;
a data verification result obtaining module 503, configured to verify to-be-processed data of the raw data to obtain a data verification result of to-be-processed data of the raw data;
and a target structured data generation module 504, configured to input to-be-processed data of each piece of raw data into a pre-generated initial structured template according to each data verification result, so as to generate target structured data.
Optionally, the to-be-processed data obtaining module 502 includes:
the target data extraction unit is used for extracting target data from original data in a data set corresponding to the data source according to a preset association and determining the target data as to-be-processed data of the original data; the preset association is an association between the data source and the data extraction rule.
Optionally, the target data extracting unit includes:
the target data extraction rule determining subunit is used for determining a target data extraction rule of the data set corresponding to any data source according to the preset association between data sources and data extraction rules;
and the target data obtaining subunit is used for obtaining the target data from the original data according to the target data extraction rule.
Optionally, the data set is a video data set, and the original data is an original video in the video data set;
accordingly, the target data obtaining subunit is specifically configured to:
according to a preset video analysis algorithm, extracting the features of the picture in the original video to obtain a video feature value;
and taking the video characteristic value as scene information, constructing a structured scene information set of a video data set, and determining the structured scene information set as the target data.
Optionally, the target data obtaining subunit is further specifically configured to:
determining the number of people in the original video according to a preset video people analysis algorithm;
determining a target service scene identifier of the video data set according to the association between the number of people and the service scene identifier;
according to a preset character feature extraction algorithm, carrying out feature extraction on any character in the original video to obtain a character feature value;
and determining the target service scene identification and the character characteristic value as the video characteristic value.
Optionally, the target data obtaining subunit is further specifically configured to:
acquiring an initial scene information set of a preset video data set;
and inputting the target service scene identification and the character characteristic value into the initial scene information set to obtain a structured scene information set of the video data set.
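As a minimal sketch of the steps above, the following example maps the number of people to a service scene identifier and fills an initial scene information set. The mapping values, the field names, and the representation of a character feature value as a list of floats are all assumptions for illustration; the video analysis and feature extraction algorithms themselves are taken as given.

```python
from typing import Dict, List

# Assumed association between the number of people and the scene identifier.
SCENE_BY_PEOPLE_COUNT: Dict[int, str] = {
    1: "SELF_SERVICE",
    2: "COUNTER_SERVICE",
}
DEFAULT_SCENE = "UNKNOWN"

def build_scene_info_set(people_count: int,
                         person_features: List[List[float]]) -> dict:
    """Fill a preset initial scene information set with the video feature value."""
    # Initial scene information set with empty fields (hypothetical layout).
    scene_info = {"scene_id": None, "person_features": []}
    scene_info["scene_id"] = SCENE_BY_PEOPLE_COUNT.get(people_count, DEFAULT_SCENE)
    # Copy each feature vector so later changes to the input do not leak in.
    scene_info["person_features"] = [list(f) for f in person_features]
    return scene_info
```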
Optionally, the target data obtaining subunit is further specifically configured to:
acquiring the voice in the original video, converting the voice in the original video into text according to a preset voice recognition algorithm, and generating a voice text;
acquiring voice keywords from the voice text according to a preset keyword extraction algorithm;
and constructing a structured keyword information set of a video data set according to the voice keywords to serve as the target data.
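The keyword step can be illustrated as follows, assuming the voice text has already been produced by a speech recognition step that is not shown. Matching against a preset vocabulary is only one possible keyword extraction algorithm and is chosen here as an assumption; the vocabulary and the field names are hypothetical.

```python
import re
from typing import List

# Hypothetical preset vocabulary of business keywords.
PRESET_KEYWORDS = ("loan", "transfer", "deposit")

def extract_keywords(voice_text: str) -> List[str]:
    """Return preset keywords found in the text, in order of first appearance."""
    tokens = re.findall(r"[a-z]+", voice_text.lower())
    found: List[str] = []
    for token in tokens:
        if token in PRESET_KEYWORDS and token not in found:
            found.append(token)
    return found

def build_keyword_info_set(video_id: str, voice_text: str) -> dict:
    """Construct a structured keyword information set for one video."""
    return {"video_id": video_id, "keywords": extract_keywords(voice_text)}
```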
Optionally, the data set is a user information picture set and/or a service information picture set, and the original data is a user information picture and/or a service information picture;
correspondingly, the target data obtaining subunit is further specifically configured to:
obtaining user information in the user information picture and/or service information in the service information picture through a preset character recognition algorithm;
constructing a structured user information set of the user information picture set according to the user information, and taking the structured user information set as the target data; and/or constructing a structured service information set of the service information picture set according to the service information, wherein the structured service information set is used as the target data.
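A minimal sketch of turning recognized text into a structured user information set is given below. It assumes the character recognition algorithm has already returned plain text with one labelled field per line; the labels `Name:` and `ID:` and the output field names are assumptions for this example only.

```python
import re
from typing import Dict, Optional

# Hypothetical field patterns for text recognized from a user information picture.
FIELD_PATTERNS = {
    "name": re.compile(r"^Name:[ \t]*(.+)$", re.MULTILINE),
    "id_number": re.compile(r"^ID:[ \t]*(\w+)$", re.MULTILINE),
}

def build_user_info(ocr_text: str) -> Dict[str, Optional[str]]:
    """Build a structured user information record; missing fields become None."""
    info: Dict[str, Optional[str]] = {}
    for field, pattern in FIELD_PATTERNS.items():
        match = pattern.search(ocr_text)
        info[field] = match.group(1).strip() if match else None
    return info
```

Recording a missing field as `None` rather than omitting it keeps every record in the set the same shape, which simplifies the later verification step.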
Optionally, the data verification result obtaining module 503 includes:
a data verification unit, used for verifying the data to be processed of the original data according to a preset data verification rule to obtain a data verification result of the data to be processed.
Optionally, the data checking unit is specifically configured to:
comparing the data to be processed with pre-stored database information;
and if the data to be processed is consistent with the database information, determining that the data verification result is verification passing.
Optionally, the data to be processed is a structured scene information set and a structured user information set;
correspondingly, the data checking unit is further specifically configured to:
after it is determined that the data to be processed is consistent with the database information, acquiring character characteristic values of users in the structured scene information set and user information in the structured user information set;
searching a face characteristic value of the user from pre-stored database information according to the user information;
and if the face characteristic value is consistent with the character characteristic value, determining that the data verification result is verification pass.
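The two-stage check above can be sketched as follows. The application does not state how "consistent" is decided for feature values; cosine similarity against a threshold is used here purely as an assumption, and the database contents, field names, and threshold are hypothetical.

```python
import math
from typing import Sequence

# Hypothetical pre-stored database: ID number -> user record.
DATABASE = {
    "ID001": {"name": "Zhang San", "face_feature": [1.0, 0.0, 0.0]},
}
SIMILARITY_THRESHOLD = 0.9  # assumed threshold for treating features as consistent

def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def verify(user_info: dict, person_feature: Sequence[float]) -> str:
    """Return "pass" only if the user information and the face feature both match."""
    record = DATABASE.get(user_info.get("id_number"))
    # Stage 1: compare the data to be processed with the database information.
    if record is None or record["name"] != user_info.get("name"):
        return "fail"
    # Stage 2: compare the stored face feature with the feature from the video.
    similarity = cosine_similarity(record["face_feature"], person_feature)
    return "pass" if similarity >= SIMILARITY_THRESHOLD else "fail"
```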
Optionally, the target structured data generating module 504 includes:
a combined field obtaining unit, configured to add a corresponding result identifier to the to-be-processed data of each original data according to each data verification result, so as to obtain a combined field of each to-be-processed data;
and the combined field adding unit is used for adding each combined field to a corresponding position in a pre-generated initial structured template to obtain the target structured data.
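The two units above can be sketched as follows: a result identifier is attached to each piece of data to be processed to form a combined field, and each combined field is written to its position in a copy of the initial structured template. The template positions and the shape of the combined field are assumptions for illustration.

```python
import copy
from typing import Any, Dict, Tuple

# Hypothetical pre-generated initial structured template with named positions.
INITIAL_TEMPLATE: Dict[str, Any] = {
    "scene_info": None,
    "user_info": None,
    "business_info": None,
}

def make_combined_field(data: Any, result: str) -> dict:
    """Attach the result identifier to the data to be processed."""
    return {"data": data, "result": result}

def generate_target_structured_data(items: Dict[str, Tuple[Any, str]]) -> dict:
    """items maps a template position to (data to be processed, verification result)."""
    target = copy.deepcopy(INITIAL_TEMPLATE)  # leave the shared template untouched
    for position, (data, result) in items.items():
        if position not in target:
            raise KeyError(f"template has no position {position!r}")
        target[position] = make_combined_field(data, result)
    return target
```

Because every data source writes into the same template, downstream business handling can read one structure regardless of where each field originated.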
According to the embodiment of the application, data sets are obtained from a plurality of different data sources, the original data in each data set is obtained, and the data to be processed is extracted from the original data. The data to be processed from the different data sources is verified to obtain data verification results, so that the original data does not need to be verified again each time it is used. According to the data verification results, the data to be processed of each data source is input into the same initial structured template to generate target structured data, so that the data of the multiple data sources is processed into a structured form; a user can directly use the target structured data when subsequently transacting business, which meets the requirement of acquiring the data in real time. This solves the problem in the prior art that the data in different data sources needs to be checked and processed every time business is handled, and effectively improves the processing efficiency of multi-source data.
Fig. 6 is a schematic diagram illustrating an architecture of an electronic device, which may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a tablet device, a personal digital assistant, etc., in accordance with an exemplary embodiment.
The apparatus 800 may include one or more of the following components: a processing component 802, a memory 804, a power component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.
The processing component 802 generally controls overall operation of the device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 802 may include one or more processors 820 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operations at the apparatus 800. Examples of such data include instructions for any application or method operating on device 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
Power components 806 provide power to the various components of device 800. The power components 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the apparatus 800.
The multimedia component 808 includes a screen that provides an output interface between the device 800 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front facing camera and/or a rear facing camera. The front camera and/or the rear camera may receive external multimedia data when the device 800 is in an operating mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, audio component 810 includes a Microphone (MIC) configured to receive external audio signals when apparatus 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 also includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor assembly 814 includes one or more sensors for providing various aspects of state assessment for the device 800. For example, the sensor assembly 814 may detect the open/closed status of the device 800 and the relative positioning of components, such as a display and keypad of the device 800. The sensor assembly 814 may also detect a change in the position of the device 800 or a component of the device 800, the presence or absence of user contact with the device 800, the orientation or acceleration/deceleration of the device 800, and a change in the temperature of the device 800. Sensor assembly 814 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor assembly 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate communications between the apparatus 800 and other devices in a wired or wireless manner. The device 800 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, there is also provided a non-transitory computer-readable storage medium comprising instructions, such as the memory 804 comprising instructions, the instructions being executable by the processor 820 of the device 800 to perform the above-described method. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
A non-transitory computer-readable storage medium is also provided, wherein instructions in the storage medium, when executed by a processor of a terminal device, enable the terminal device to perform the above data processing method based on multiple data sources.
The application also discloses a computer program product comprising a computer program which, when executed by a processor, implements the method as described in the embodiments.
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present application may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine, or entirely on the remote machine or electronic device.
In the context of this application, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data electronic device), or that includes a middleware component (e.g., an application electronic device), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include a client and an electronic device. The client and the electronic device are generally remote from each other and typically interact through a communication network. The relationship of client and electronic device arises by virtue of computer programs running on the respective computers and having a client-electronic device relationship to each other. The electronic device may be a cloud electronic device, also called a cloud computing electronic device or a cloud host, which is a host product in a cloud computing service system and overcomes the defects of high management difficulty and weak service extensibility found in a traditional physical host and in a VPS ("Virtual Private Server") service. The electronic device may also be a distributed system of electronic devices or an electronic device incorporating a blockchain.
It should be understood that the flows shown above may be used in various forms, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present application can be achieved, which is not limited herein.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It will be understood that the present application is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (16)

1. A data processing method based on multiple data sources is characterized by comprising the following steps:
extracting, from at least two data sources, data sets corresponding to the data sources; the data source comprises a plurality of data sets, and the data sets comprise at least one piece of original data;
extracting target data in the original data to obtain data to be processed of the original data;
verifying the data to be processed of the original data to obtain a data verification result of the data to be processed of the original data;
and inputting the data to be processed of each original data into a pre-generated initial structured template according to each data verification result to generate target structured data.
2. The method of claim 1, wherein extracting target data from the original data to obtain data to be processed of the original data comprises:
extracting target data from original data in a data set corresponding to the data source according to a preset association relationship, and determining the target data as to-be-processed data of the original data; the preset association relationship is an association relationship between a data source and a data extraction rule.
3. The method according to claim 2, wherein extracting target data from original data in a data set corresponding to the data source according to a preset association relationship comprises:
determining a target data extraction rule of the data set corresponding to any data source according to the preset association relationship between the data source and the data extraction rule;
and obtaining the target data from the original data according to the target data extraction rule.
4. The method of claim 3, wherein the data set is a video data set, and the original data is an original video in the video data set;
correspondingly, obtaining the target data from the original data according to the target data extraction rule includes:
according to a preset video analysis algorithm, extracting the features of the picture in the original video to obtain a video feature value;
and taking the video characteristic value as scene information, constructing a structured scene information set of a video data set, and determining the structured scene information set as the target data.
5. The method according to claim 4, wherein extracting features of the pictures in the original video according to a preset video analysis algorithm to obtain video feature values comprises:
determining the number of people in the original video according to a preset video people analysis algorithm;
determining a target service scene identifier of the video data set according to the association relationship between the number of people and the service scene identifier;
according to a preset character feature extraction algorithm, carrying out feature extraction on any character in the original video to obtain a character feature value;
and determining the target service scene identification and the character characteristic value as the video characteristic value.
6. The method of claim 5, wherein constructing a structured scene information set of a video data set using the video feature values as scene information comprises:
acquiring an initial scene information set of a preset video data set;
and inputting the target service scene identification and the character characteristic value into the initial scene information set to obtain a structured scene information set of the video data set.
7. The method of claim 4, wherein obtaining the target data from the original data according to the target data extraction rule further comprises:
acquiring the voice in the original video, converting the voice in the original video into text according to a preset voice recognition algorithm, and generating a voice text;
acquiring voice keywords from the voice text according to a preset keyword extraction algorithm;
and constructing a structured keyword information set of a video data set according to the voice keywords to serve as the target data.
8. The method according to claim 3, wherein the data set is a user information picture set and/or a service information picture set, and the original data is a user information picture and/or a service information picture;
correspondingly, obtaining the target data from the original data according to the target data extraction rule includes:
obtaining user information in the user information picture and/or service information in the service information picture through a preset character recognition algorithm;
constructing a structured user information set of the user information picture set according to the user information, and using the structured user information set as the target data; and/or constructing a structured service information set of the service information picture set according to the service information, wherein the structured service information set is used as the target data.
9. The method according to claim 1, wherein verifying the to-be-processed data of the original data to obtain a data verification result of the to-be-processed data of the original data comprises:
and according to a preset data verification rule, verifying the data to be processed of the original data to obtain a data verification result of the data to be processed.
10. The method according to claim 9, wherein verifying the to-be-processed data of the original data according to a preset data verification rule to obtain a data verification result of the to-be-processed data comprises:
comparing the data to be processed with pre-stored database information;
and if the data to be processed is consistent with the database information, determining that the data verification result is verification passing.
11. The method according to claim 10, wherein the data to be processed is a structured scene information set and a structured user information set;
correspondingly, after the data to be processed is consistent with the database information, the method further comprises the following steps:
acquiring character characteristic values of users in the structured scene information set and user information in the structured user information set;
searching a face characteristic value of the user from pre-stored database information according to the user information;
and if the face characteristic value is consistent with the character characteristic value, determining that the data verification result is verification pass.
12. The method according to claim 1, wherein inputting the data to be processed of each original data into a pre-generated initial structured template according to each data verification result to generate target structured data comprises:
according to the data verification result, adding a corresponding result identifier in the data to be processed of each original data to obtain a combined field of each data to be processed;
and adding each combined field to a corresponding position in a pre-generated initial structured template to obtain the target structured data.
13. A data processing apparatus based on multiple data sources, comprising:
the data set extraction module is used for extracting data sets corresponding to at least two data sources; the data source comprises a plurality of data sets, and the data sets comprise at least one piece of original data;
a to-be-processed data obtaining module, configured to extract target data in the original data to obtain to-be-processed data of the original data;
a data verification result obtaining module, configured to verify the to-be-processed data of the original data to obtain a data verification result of the to-be-processed data of the original data;
and the target structured data generation module is used for inputting the data to be processed of each original data into a pre-generated initial structured template according to each data verification result so as to generate target structured data.
14. An electronic device, comprising: a processor, and a memory communicatively coupled to the processor;
the memory stores computer-executable instructions;
the processor executes computer-executable instructions stored by the memory to implement the multiple data source-based data processing method of any one of claims 1-12.
15. A computer-readable storage medium having stored thereon computer-executable instructions for implementing the multiple data source-based data processing method of any one of claims 1 to 12 when executed by a processor.
16. A computer program product comprising a computer program which, when executed by a processor, implements the multiple data source based data processing method of any one of claims 1-12.
CN202210468742.1A 2022-04-29 2022-04-29 Data processing method, device and equipment based on multiple data sources and storage medium Pending CN114723551A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210468742.1A CN114723551A (en) 2022-04-29 2022-04-29 Data processing method, device and equipment based on multiple data sources and storage medium

Publications (1)

Publication Number Publication Date
CN114723551A true CN114723551A (en) 2022-07-08

Family

ID=82244793

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210468742.1A Pending CN114723551A (en) 2022-04-29 2022-04-29 Data processing method, device and equipment based on multiple data sources and storage medium

Country Status (1)

Country Link
CN (1) CN114723551A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116579750A (en) * 2023-07-13 2023-08-11 南京元圈软件科技有限公司 RPA control data processing method and device based on artificial intelligence
CN116579750B (en) * 2023-07-13 2023-09-12 南京元圈软件科技有限公司 RPA control data processing method and device based on artificial intelligence

Similar Documents

Publication Publication Date Title
US10930010B2 (en) Method and apparatus for detecting living body, system, electronic device, and storage medium
US8917913B2 (en) Searching with face recognition and social networking profiles
US10452890B2 (en) Fingerprint template input method, device and medium
RU2669063C2 (en) Method and device for image acquisition
CN109271850B (en) Merchant information uploading method and device, electronic equipment and storage medium
CN110781813B (en) Image recognition method and device, electronic equipment and storage medium
CN114240882A (en) Defect detection method and device, electronic equipment and storage medium
CN114723551A (en) Data processing method, device and equipment based on multiple data sources and storage medium
US20160350622A1 (en) Augmented reality and object recognition device
CN111079421B (en) Text information word segmentation processing method, device, terminal and storage medium
CN111797746A (en) Face recognition method and device and computer readable storage medium
US20170076368A1 (en) Method and Device for Processing Card Application Data
CN116912478A (en) Object detection model construction, image classification method and electronic equipment
CN116630074A (en) Invoice reimbursement method, invoice reimbursement device, electronic equipment and storage medium
CN111666936A (en) Labeling method, labeling device, labeling system, electronic equipment and storage medium
US20210112057A1 (en) Multi-party document validation
CN114090738A (en) Method, device and equipment for determining scene data information and storage medium
CN111626883A (en) Authority verification method and device, electronic equipment and storage medium
CN115329390B (en) Financial privacy information security auditing method and device based on privacy protection calculation
US20240040232A1 (en) Information processing apparatus, method thereof, and program thereof, and information processing system
CN116645052A (en) Method, device, equipment and storage medium for auditing service information
CN111932500A (en) Image processing method and device
CN116723272A (en) Voice information pushing method, device, equipment and storage medium
CN116383184A (en) Data detection method and device based on data table, electronic equipment and storage medium
WO2019090617A1 (en) People counting method and people counting system based on intelligent terminal

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination