CN107016050B - Data processing method and device - Google Patents


Publication number
CN107016050B
Authority
CN
China
Prior art keywords
data
target
voice format
format data
type
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710106370.7A
Other languages
Chinese (zh)
Other versions
CN107016050A (en)
Inventor
杨柳
何伟
胡红艳
索娟
李雅洁
高阳
马斌
李志刚
王天军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
Information and Telecommunication Branch of State Grid Xinjiang Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
Information and Telecommunication Branch of State Grid Xinjiang Electric Power Co Ltd
Application filed by State Grid Corp of China SGCC and Information and Telecommunication Branch of State Grid Xinjiang Electric Power Co Ltd
Priority to CN201710106370.7A
Publication of CN107016050A
Application granted
Publication of CN107016050B
Legal status: Active


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/903: Querying
    • G06F16/90335: Query processing

Abstract

The invention discloses a data processing method and device. The method comprises: acquiring data to be processed, wherein the data to be processed reflects work order information of a target object and the data types of the data to be processed include a target data type; determining a target data division mode corresponding to the target data type; and dividing the data whose data type is the target data type according to the target data division mode. The invention solves the technical problem in the related art of low data processing efficiency caused by processing various types of data uniformly.

Description

Data processing method and device
Technical Field
The present invention relates to the field of data processing, and in particular, to a data processing method and apparatus.
Background
As enterprises have come to understand and adjust their core competitiveness more deeply, customer service capability has become one of their most central sources of value, and customer service centers have emerged accordingly. A customer service center, also called a call center or telemarketing center, makes full use of the functions of communication networks and computer networks and, based on CTI (Computer Telephony Integration) technology, is integrated with the enterprise to form a complete, comprehensive information service system. The customer service center is a direct window for communication between an enterprise and its customers, and the information interaction data generated during that communication plays a very important role in the unified coordination of sales, scheduling, management, personnel assessment, and value growth across the whole enterprise.
Therefore, to make effective use of the information interaction data generated during communication, the data needs to be processed so that it can be analyzed and the useful information in it mined.
In the prior art, the information interaction data generated during communication is generally processed uniformly, in the chronological order in which it arose. However, image and voice data are not brought together during this processing; each piece of data is usually stored independently and managed in a decentralized way, which forms "data islands" and hinders data processing and utilization.
No effective solution has yet been proposed for the problem of low data processing efficiency caused by the related art processing various types of data uniformly.
Disclosure of Invention
The embodiment of the invention provides a data processing method and device, which at least solve the technical problem of low data processing efficiency caused by unified processing of various types of data in the related technology.
According to an aspect of an embodiment of the present invention, there is provided a data processing method including: acquiring data to be processed, wherein the data to be processed is data used for reflecting work order information of a target object, and the data type of the data to be processed at least comprises a target data type; determining a target data dividing mode corresponding to the target data type; and performing data division on the data with the data type as the target data type according to the target data division mode.
Further, the target data type includes at least one of: image format data; voice format data; structured text format data.
Further, when the target data type is the image format data, the target data division mode is a mode of segmenting the image format data according to geometric shape; when the target data type is the voice format data, the target data division mode is a mode of merging the voice format data whose data volume is lower than a preset threshold; and when the target data type is the structured text format data, the target data division mode is a mode of splitting the data table corresponding to the structured text format data.
Further, when the target data type is the voice format data, dividing the data whose data type is the target data type according to the target data division mode includes: acquiring the data volume of the voice format data; judging whether the data volume of the voice format data is lower than a preset threshold; when the data volume of the voice format data is lower than the preset threshold, determining the voice format data as voice format data to be merged; and merging the voice format data to be merged.
Further, the merging the voice format data to be merged includes: executing the following merging operation on the voice format data to be merged to obtain a voice format data block until the data volume of the voice format data block is not lower than the preset threshold value, wherein the voice format data to be merged is marked as the current voice format data when the merging operation is executed: merging the current voice format data into the voice format data block; judging whether the data volume of the voice format data block is lower than the preset threshold value or not; and determining the next voice format data as the current voice format data under the condition that the data quantity of the voice format data block is lower than the preset threshold value.
Further, after the data of which the data type is the target data type is divided according to the target data division manner, the method further includes: and storing a target data block obtained by dividing the data with the data type as the target data type in a target database.
Further, after the target data block obtained by dividing the data with the data type of the target data type is stored in the target database, the method further includes: and setting a target index mode for the data with the data type as the target data type in the target database.
According to another aspect of the embodiments of the present invention, there is also provided a data processing apparatus, including: an acquisition unit, configured to acquire data to be processed, wherein the data to be processed reflects work order information of a target object, and the data types of the data to be processed include at least a target data type; a determining unit, configured to determine a target data division mode corresponding to the target data type; and a dividing unit, configured to divide the data whose data type is the target data type according to the target data division mode.
Further, the target data type includes at least one of: image format data; voice format data; structured text format data.
Further, an image dividing module is configured to, when the target data type is the image format data, use as the target data division mode a mode of segmenting the image format data according to geometric shape; a voice dividing module is configured to, when the target data type is the voice format data, use as the target data division mode a mode of merging the voice format data whose data volume is lower than a preset threshold; and a text dividing module is configured to, when the target data type is the structured text format data, use as the target data division mode a mode of splitting the data table corresponding to the structured text format data.
Further, in a case that the target data type is the voice format data, the dividing unit includes: the acquisition module is used for acquiring the data volume of the voice format data; the judging module is used for judging whether the data volume of the voice format data is lower than a preset threshold value or not; the determining module is used for determining the voice format data as the voice format data to be merged under the condition that the data volume of the voice format data is lower than the preset threshold value; and the merging module is used for merging the voice format data to be merged.
Further, the merging module includes: executing the following merging operation on the voice format data to be merged to obtain a voice format data block until the data volume of the voice format data block is not lower than the preset threshold value, wherein the voice format data to be merged is marked as the current voice format data when the merging operation is executed: a merging submodule, configured to merge the current voice format data into the voice format data block; the judging submodule is used for judging whether the data volume of the voice format data block is lower than the preset threshold value or not; and the determining submodule is used for determining the next voice format data as the current voice format data under the condition that the data quantity of the voice format data block is lower than the preset threshold value.
Further, after the dividing unit, the apparatus further includes: and the storage module is used for storing a target data block obtained by dividing the data with the data type as the target data type in a target database.
Further, after the storing module, the apparatus further comprises: and the index module is used for setting a target index mode for the data with the data type as the target data type in the target database.
In the embodiment of the invention, to-be-processed data for reflecting the work order information of the target object and the target data type corresponding to the to-be-processed data are obtained, the target data dividing mode corresponding to the target data type is determined according to the obtained target data type of the to-be-processed data, and then the data with the data type being the target data type are divided according to the target data dividing mode. By adopting the data processing method and device, various types of data are processed respectively according to the data dividing modes corresponding to the various types of data, and the purpose of performing different processing on the different types of data is achieved, so that the technical effect of improving the data processing efficiency is achieved, and the technical problem of low data processing efficiency caused by unified processing on the various types of data in the related technology is solved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a flow diagram of an alternative data processing method according to an embodiment of the invention;
FIG. 2 is a schematic diagram of an alternative Hadoop cluster environment according to embodiments of the invention;
FIG. 3(a) is a schematic diagram of an alternative horizontally sliced image formatted data according to an embodiment of the present invention;
FIG. 3(b) is a schematic diagram of an alternative vertically sliced image format data according to an embodiment of the present invention;
FIG. 3(c) is a schematic diagram of an alternative tile sliced image format data according to an embodiment of the present invention;
FIG. 3(d) is a schematic diagram of an alternative irregular split image format data according to an embodiment of the invention;
FIG. 4 is a diagram illustrating an alternative voice formatted data merging method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of an alternative manner of indexing image format data according to an embodiment of the invention;
FIG. 6 is a schematic illustration of an alternative storage of data for work order information in accordance with an embodiment of the present invention;
FIG. 7 is a schematic diagram of an alternative data processing apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In accordance with an embodiment of the present invention, a data processing method embodiment is provided. It should be noted that the steps illustrated in the flowchart of the drawings may be performed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is illustrated in the flowchart, in some cases the steps illustrated or described may be performed in a different order.
Fig. 1 is a flow chart of an alternative data processing method according to an embodiment of the present invention, as shown in fig. 1, the method including the steps of:
step S102, acquiring data to be processed, wherein the data to be processed is data used for reflecting work order information of a target object, and the data type of the data to be processed at least comprises a target data type;
step S104, determining a target data dividing mode corresponding to the target data type;
and step S106, performing data division on the data with the data type as the target data type according to the target data division mode.
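Steps S102 to S106 amount to a dispatch on data type. The following minimal Python sketch illustrates that flow only; the names `DIVISION_MODES` and `plan_division`, the type labels, and the mode names are hypothetical and are not taken from the patent.

```python
from collections import defaultdict

# Hypothetical mapping from a target data type to the name of its division mode (S104).
DIVISION_MODES = {
    "image": "segment_by_geometric_shape",
    "voice": "merge_below_threshold",
    "structured_text": "split_data_table",
}


def plan_division(records):
    """Group (data_type, payload) records by the division mode of their type.

    Records whose type has no registered mode are returned separately.
    """
    plan = defaultdict(list)
    unhandled = []
    for data_type, payload in records:        # S102: the acquired data to be processed
        mode = DIVISION_MODES.get(data_type)  # S104: look up the division mode
        if mode is None:
            unhandled.append((data_type, payload))
        else:
            plan[mode].append(payload)        # S106 would apply `mode` to these payloads
    return dict(plan), unhandled
```

Keeping the type-to-mode mapping in a table means a new data type can be supported by adding one entry, without changing the dispatch loop.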
Through the steps, the data to be processed for reflecting the work order information of the target object and the target data type corresponding to the data to be processed are obtained, the target data dividing mode corresponding to the target data type is determined according to the obtained target data type of the data to be processed, and then the data with the data type being the target data type is divided according to the target data dividing mode. By adopting the data processing method and device, various types of data are processed respectively according to the data dividing modes corresponding to the various types of data, and the purpose of performing different processing on the different types of data is achieved, so that the technical effect of improving the data processing efficiency is achieved, and the technical problem of low data processing efficiency caused by unified processing on the various types of data in the related technology is solved.
In the scheme provided in step S102, the data to be processed is data that reflects the work order information of the target object. Work order information is information used to convey work instructions or work contents within an enterprise. For example, an enterprise communicates with a customer through a customer service center, and the communication generates data such as customer voice files, error interface screenshots, work order description texts, and customer feedback texts. This data is the data of the work order information, and the object that sends it is the target object.
In an alternative embodiment, the data to be processed includes at least a target data type, which may be some predetermined format data, for example, the target data type may be image format data, the target data type may be structured text format data, and the target data type may be voice format data.
In the scheme provided in step S104, different target data types correspond to different target data division modes, and the division mode for a piece of target data may be determined from its type. For example, when the target data type is image format data, the corresponding target data division mode may be horizontal slicing.
In the scheme provided in step S106, the data of the target data type is divided according to the target data dividing manner corresponding to the target data type.
As an alternative embodiment, the target data type may include at least one of: image format data; voice format data; structured text format data. In this embodiment, the target data type is determined by the format of the data to be processed. Data in different formats has different characteristics and therefore calls for different processing, so distinguishing image format data, voice format data, and structured text format data allows a corresponding division mode to be set for each format and makes it easy to determine the division mode for data of each target data type.
As an optional embodiment, when the target data type is image format data, the target data division mode is a mode of segmenting the image format data according to geometric shape; when the target data type is voice format data, the target data division mode is a mode of merging the voice format data whose data volume is lower than a preset threshold; and when the target data type is structured text format data, the target data division mode is a mode of splitting the data table corresponding to the structured text format data. In this way, different data processing modes can be determined for different target data types, which makes it easier to process different types of data and improves data processing efficiency.
Optionally, when the target data type is image format data, the image format data may be divided into a plurality of geometric data by horizontal segmentation, vertical segmentation, rectangular block segmentation, or irregular segmentation, so as to divide one data with a large information capacity into a plurality of data with a small information capacity, thereby facilitating data processing.
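As an illustration of the regular segmentation modes named above, the sketch below slices a small 2-D grid standing in for decoded image pixels. It assumes that horizontal slicing yields horizontal strips and vertical slicing yields vertical strips, which the text does not state explicitly; the function names are hypothetical, irregular segmentation is omitted, and the number of parts is assumed not to exceed the image dimension.

```python
def split_horizontal(image, parts):
    """Cut a 2-D grid (a list of rows) into `parts` horizontal strips."""
    rows = len(image)
    bounds = [rows * i // parts for i in range(parts + 1)]
    return [image[bounds[i]:bounds[i + 1]] for i in range(parts)]


def split_vertical(image, parts):
    """Cut a 2-D grid into `parts` vertical strips."""
    cols = len(image[0])
    bounds = [cols * i // parts for i in range(parts + 1)]
    return [[row[bounds[i]:bounds[i + 1]] for row in image] for i in range(parts)]


def split_tiles(image, tile_rows, tile_cols):
    """Rectangular block slicing: cut into horizontal strips, then cut each strip vertically."""
    return [
        tile
        for strip in split_horizontal(image, tile_rows)
        for tile in split_vertical(strip, tile_cols)
    ]
```

A real JPG would first have to be decoded into pixel rows by an imaging library; that step is outside this sketch.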
Optionally, when the type of the target data is structured text format data, the structured text format data may be stored in a corresponding data table, and then the data table corresponding to the structured text format data is split to obtain a plurality of sub data tables, so that one data with a large information capacity is divided into a plurality of data with a small information capacity, which is convenient for data processing.
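The text does not say how the data table is split. One simple possibility, shown here purely as an assumption, is to cut the table into sub-tables of at most a fixed number of rows; `split_table` and `max_rows` are hypothetical names.

```python
def split_table(rows, max_rows):
    """Split a table (a list of row records) into sub-tables of at most `max_rows` rows."""
    if max_rows <= 0:
        raise ValueError("max_rows must be positive")
    return [rows[i:i + max_rows] for i in range(0, len(rows), max_rows)]
```

Other splitting criteria, such as splitting by key range or by column group, would fit the description equally well.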
As an optional embodiment, when the target data type is voice format data, dividing the data whose data type is the target data type according to the target data division mode may include: acquiring the data volume of the voice format data; judging whether the data volume of the voice format data is lower than a preset threshold; when the data volume of the voice format data is lower than the preset threshold, determining the voice format data as voice format data to be merged; and merging the voice format data to be merged. In this embodiment, the data volume of each piece of voice format data is checked and the pieces below the preset threshold are merged into voice format data whose data volume is not lower than the threshold. This consolidates the small pieces of voice format data, reduces their number, and makes the data easier to process.
As an alternative embodiment, the merging the voice format data to be merged may include: executing the following merging operation on the voice format data to be merged to obtain a voice format data block until the data volume of the voice format data block is not lower than a preset threshold value, wherein the voice format data to be merged is marked as current voice format data when the merging operation is executed: merging the current voice format data into a voice format data block; judging whether the data volume of the voice format data block is lower than a preset threshold value or not; and determining the next voice format data as the current voice format data in the case that the data amount of the voice format data block is lower than a predetermined threshold value. By adopting the invention, the voice format data with the data volume lower than the preset threshold value is merged into the voice format data with the data volume higher than the preset threshold value, so that the integration of the voice format data with smaller data volume is realized, the number of the voice format data is reduced, and the data processing is convenient.
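The merging operation described above can be sketched as a single pass over the files. This is a minimal illustration under stated assumptions, not the patented implementation: files are represented as hypothetical `(name, size)` pairs, they are merged in the order given, and a trailing block that never reaches the threshold is still emitted.

```python
def merge_small_voice_files(files, threshold):
    """Merge files smaller than `threshold` into blocks of at least `threshold`.

    `files` is a list of (name, size) pairs. Returns (blocks, large), where
    `blocks` is a list of lists of names and `large` holds the names of files
    that were not merge candidates.
    """
    blocks, large = [], []
    current, current_size = [], 0
    for name, size in files:
        if size >= threshold:
            large.append(name)          # not lower than the threshold: left as-is
            continue
        current.append(name)            # merge the current file into the block
        current_size += size
        if current_size >= threshold:   # block is no longer below the threshold
            blocks.append(current)
            current, current_size = [], 0
    if current:
        blocks.append(current)          # leftover block may still be below the threshold
    return blocks, large
```

For example, with a threshold of 5 and sizes 2, 2, 3, 6, 1, the first three files form one block, the file of size 6 is left unmerged, and the last file ends up in a leftover block of its own.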
As an optional embodiment, after performing data partitioning on data of which the data type is the target data type according to the target data partitioning manner, the embodiment may further include: and storing a target data block obtained by dividing the data with the data type as the target data type in a target database. By adopting the method and the device, the data of the target data type is divided according to the target data dividing mode corresponding to the target data type to obtain the target data block, and the target data block is stored in the target database, so that the data processing is convenient.
Optionally, the divided target data blocks are stored in a target database, the target database corresponding to the target data blocks may be determined according to the target data types corresponding to the target data blocks, and the target data blocks are stored in the corresponding target databases.
As an optional embodiment, after storing a target data block obtained by dividing data of which the data type is the target data type in the target database, the embodiment may further include: and setting a target index mode for the data with the data type as the target data type in the target database. By adopting the method and the device, the corresponding target index mode can be set according to the target data type corresponding to the target data block stored in the target database, the specific target index mode is used for inquiring the target data block in a targeted manner, and the index speed can be improved.
The invention also provides a preferred embodiment, and the preferred embodiment provides a data processing method applied to a multi-element heterogeneous work order big data distributed storage and analysis platform.
In enterprises, as business continues to develop, the number of work orders grows geometrically, and the customer service center accumulates a large amount of work order information data, including multi-source heterogeneous data such as customer voice files, error interface screenshots, work order description texts, and customer feedback texts. This data can serve as a primary data source supporting data analysis. For example, by analyzing the text fed back by customers and the total number of work orders handled, the service quality and capability level of customer service staff can be evaluated objectively, which plays a very important role in unified coordination.
However, if the collection and storage of work order information data lack unified standards and specifications, and the data is stored independently and managed in a decentralized way, data islands form, which hinders data processing and utilization.
To address the low data processing efficiency and low data utilization that result from storing work order information data independently and managing it in a decentralized way, the data processing method processes the data in light of the characteristics of multi-source heterogeneous work order information data: large volume, heterogeneity, complexity, and dynamism. The specific process is as follows:
1. According to the characteristics of the multi-source heterogeneous work order information data, establish a fusion storage model for the data in a Hadoop cluster environment;
2. On the basis of the fusion storage model, establish a suitable index mode for each type of data to improve data query efficiency;
3. Perform data analysis based on the fusion storage model and the corresponding index modes, and display the analysis results visually on a Web interface.
It should be noted that Hadoop is an open-source distributed system infrastructure. It provides high-throughput access to application data, is suitable for applications with very large data sets, and makes full use of the cluster's capacity for high-speed computation and storage.
FIG. 2 is a schematic diagram of an alternative Hadoop cluster environment according to an embodiment of the present invention. As shown in FIG. 2, the cluster environment may include a server side and a Web side. The Hadoop cluster on the server side may include a master server and a plurality of slave servers, with the master server connected to the slave servers through a network. Analysis results such as work order type statistics, work order event statistics, module fault frequency ranking, customer order receiving quantity ranking, and customer service ranking are displayed on the Web interface.
As an alternative embodiment, the specific manner of data processing is as follows:
(1) A Hadoop cluster environment based on the HDFS distributed file system can be built under Linux, and the data of the work order information is classified according to data format.
It should be noted that Linux is a multi-user network operating system with stable performance. HDFS, the Hadoop Distributed File System, is a distributed file system designed to run on commodity hardware.
Optionally, the data of the work order information is classified according to a data format, and may be classified into voice format data in wav format, image format data in jpg format, and structured text format data.
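Classification by data format could, for file-based inputs, be as simple as inspecting the file extension. The sketch below assumes exactly that; `FORMAT_BY_SUFFIX` and `classify` are hypothetical names, and because the text does not say how structured text arrives, any other extension is simply reported as unclassified.

```python
from pathlib import PurePath

# Hypothetical mapping from file extension to the data formats named in the text.
FORMAT_BY_SUFFIX = {
    ".wav": "voice",
    ".jpg": "image",
}


def classify(filename):
    """Return the data format label for a file name, ignoring extension case."""
    suffix = PurePath(filename).suffix.lower()
    return FORMAT_BY_SUFFIX.get(suffix, "unclassified")
```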
(2) In massive data parallel computing, the division of data blocks is an important part of parallel processing, and the division mode of the data blocks, the size of the data blocks and the parallel computing efficiency are closely related. In order to improve the retrieval rate of the work order data, different data block division modes can be adopted for the data of different types of work order information.
1) The image format data in the jpg format is divided.
Fig. 3(a) is a schematic diagram of alternative horizontally sliced image format data according to an embodiment of the present invention. As shown in fig. 3(a), jpg format image data may be sliced horizontally.
Fig. 3(b) is a schematic diagram of alternative vertically sliced image format data according to an embodiment of the present invention. As shown in fig. 3(b), jpg format image data may be sliced vertically.
Fig. 3(c) is a schematic diagram of alternative rectangular block sliced image format data according to an embodiment of the present invention. As shown in fig. 3(c), jpg format image data may be sliced into rectangular blocks.
Fig. 3(d) is a schematic diagram of alternative irregularly sliced image format data according to an embodiment of the present invention. As shown in fig. 3(d), jpg format image data may be sliced irregularly.
2) Voice format data in wav format is divided.
Wav voice files are usually small: if a customer call lasts less than 5 minutes, the corresponding voice format data is smaller than 5 MB. A Hadoop cluster uses the NameNode master node to store the metadata of the data blocks in the cluster, so when many small files under 5 MB are stored, the load on the NameNode rises sharply. A data merging strategy is therefore adopted to merge the small Wav voice files.
It should be noted that the NameNode manages the file system namespace; it maintains the file system tree and the metadata for all the files and directories in the tree.
Fig. 4 is a schematic diagram of an alternative way of merging the voice format data according to an embodiment of the present invention, as shown in fig. 4, numbers 1 to 7 are voice format data below a threshold, where the height of the graph represents the data amount of the voice format data, the voice files numbers 1, 2, and 3 are merged, the voice files numbers 5 and 6 are merged, and the voice files numbers 4 and 7 are merged, which may form a voice format data block with a data amount above the threshold.
Optionally, metadata information such as the merging information of the voice format data block, the corresponding worksheet number, and the like is stored in the HBase database.
It should be noted that HBase, called Hadoop Database, is a highly reliable, high-performance, column-oriented, scalable distributed storage system.
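A minimal sketch of the small-file merging strategy, assuming the sequential procedure set out in the claims (add the next file to the current block until the block reaches the threshold) rather than the specific grouping drawn in Fig. 4. The function name, the tuple representation, and the handling of a trailing undersized block are illustrative assumptions.

```python
from typing import Iterable, List, Tuple

VoiceFile = Tuple[str, int]  # (file name, data volume in arbitrary units)


def merge_small_files(files: Iterable[VoiceFile],
                      threshold: int) -> Tuple[List[str], List[List[str]]]:
    """Group files below the threshold into blocks of at least the threshold.

    Returns (files already at or above the threshold, merged blocks).
    """
    large: List[str] = []
    blocks: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for name, size in files:
        if size >= threshold:
            large.append(name)  # not a small file, stored as it is
            continue
        current.append(name)    # merge the current file into the block
        current_size += size
        if current_size >= threshold:
            blocks.append(current)  # block is no longer below the threshold
            current, current_size = [], 0
    if current:
        # Assumption: a trailing block is kept even if it stays below the threshold.
        blocks.append(current)
    return large, blocks
```

The returned grouping is the kind of merging information that could be recorded in HBase together with the corresponding work order numbers, so that an individual recording can still be located inside a merged block.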
3) Dividing structured text format data.
The structured work order information data is stored directly in the HBase database to generate a data table; the data table is split and then stored on HDFS in the form of HFile files.
It should be noted that HFile is the file format in which HBase stores its data.
(3) Fragment indexing under HDFS is studied, and a suitable indexing mode is constructed for each type of data.
Fig. 5 is a schematic diagram of an optional image format data indexing method according to an embodiment of the present invention. As shown in Fig. 5, for image format data, a quadtree spatial index oriented to an image pyramid may be tried, and the image format data is indexed layer by layer in units of Blocks (a Block being the minimum storage and processing unit in the database). The data includes multiple Blocks, and each Block is numbered in order of hierarchy; for example, at the nth layer (Level n) the Block number is B0, and at layer n+1 (Level n+1) the Block numbers are B01, B02, B03, and B04.
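The hierarchical Block numbering can be sketched as follows. The scheme appends one digit per level, which matches the B0 to B01..B04 example; the assignment of digits to quadrants (1 top-left, 2 top-right, 3 bottom-left, 4 bottom-right), the square image, and all function names are assumptions made for illustration only.

```python
from typing import List


def children(block_id: str) -> List[str]:
    """Return the four child Block numbers one level down, e.g. B0 -> B01..B04."""
    return [block_id + digit for digit in "1234"]


def parent(block_id: str, root: str = "B0") -> str:
    """Return the parent Block number by dropping the last digit."""
    if block_id == root:
        raise ValueError("the root Block has no parent")
    return block_id[:-1]


def locate(x: float, y: float, size: float, depth: int, root: str = "B0") -> str:
    """Return the number of the Block, depth levels below the root, containing (x, y).

    The image is assumed to be a size x size square with its origin at the top-left.
    """
    if not (0 <= x < size and 0 <= y < size):
        raise ValueError("point lies outside the image")
    block_id, left, top, span = root, 0.0, 0.0, float(size)
    for _ in range(depth):
        span /= 2
        right = x >= left + span   # point is in the right half of the Block
        lower = y >= top + span    # point is in the lower half of the Block
        block_id += str(1 + int(right) + 2 * int(lower))
        if right:
            left += span
        if lower:
            top += span
    return block_id
```

Because a child number always starts with its parent's number, all Blocks under a given pyramid node share a common prefix, which makes layer-by-layer lookup a simple prefix comparison.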
Optionally, for the structured work order data, a multi-level index may be established in the HBase database. For example, before indexing by work order number, the data is indexed first by area, then by system, and then by module, forming a 3-level index of the work order information and improving indexing speed.
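One common way to obtain such a multi-level index in a sorted key-value store is to encode the levels as a row-key prefix. The sketch below models this in plain Python with a sorted list and does not use any HBase API; the key layout, the separator, and the sample values are assumptions, since the patent does not specify how the index is stored.

```python
import bisect
from typing import List

SEP = "|"  # hypothetical separator between index levels


def make_row_key(area: str, system: str, module: str, work_order_no: str) -> str:
    """Build a composite key: area, then system, then module, then work order number."""
    parts = (area, system, module, work_order_no)
    if any(SEP in part for part in parts):
        raise ValueError("index fields must not contain the separator")
    return SEP.join(parts)


def prefix_scan(sorted_keys: List[str], *levels: str) -> List[str]:
    """Return all keys under the given leading levels, e.g. (area,) or (area, system)."""
    prefix = SEP.join(levels) + SEP
    start = bisect.bisect_left(sorted_keys, prefix)  # jump to the first candidate
    matches: List[str] = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # keys are sorted, so no later key can match
        matches.append(key)
    return matches
```

With keys sorted this way, narrowing a query by area, system, and module only touches a contiguous range of rows instead of scanning every work order.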
(4) By analyzing the work order information data, analyses such as work order type statistics, work order statistics, module fault frequency ranking, customer service order-receiving volume ranking, and customer service scheduling are realized, and the analysis results are displayed on the Web end.
Fig. 6 is a schematic diagram of optional storage of work order information data according to an embodiment of the present invention. As shown in Fig. 6, the HDFS client sends a read request for a data block to HDFS. According to the request, HDFS obtains the data block name from HBase and obtains from the name node the data nodes on which the data block is located. The requested data block is then read from the data nodes through the HDFS access interface, and after the reading is completed, the HDFS access interface sends the HDFS client an instruction to close the connection.
As shown in Fig. 6, work order information data in multiple formats, such as voice format data, image format data, and structured text format data, is recorded in HDFS. The image format data includes screenshots of error interfaces, the voice format data includes customer inquiry voice and customer service voice, and the structured text format data includes the work order text information and the data tables generated from the work order text information.
As shown in Fig. 6, information about the multiple formats of work order information data, such as voice format data, image format data, and structured text format data, is recorded in HBase. For image format data, this includes data block information, the column number of the data block, and the corresponding work order number; for structured text format data, it includes data sub-table information, sub-table numbers, and the like; for voice format data, it includes voice block information, voice block numbers, the corresponding work order numbers, and the like.
As shown in fig. 6, each data block node includes a data block and a copy, and each data block is the minimum storage and processing unit in the database.
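The read path of Fig. 6 can be modelled with a small in-memory sketch. It is not Hadoop or HBase client code: the three dictionaries stand in for the HBase metadata table, the name node's block-location map, and the data nodes, and every name in it is hypothetical. It only illustrates the order of the lookups and how a copy on another data node can serve the read.

```python
from typing import Dict, List


def read_block(work_order_no: str,
               hbase_meta: Dict[str, str],
               name_node: Dict[str, List[str]],
               data_nodes: Dict[str, Dict[str, bytes]]) -> bytes:
    """Resolve a work order number to its data block and read it from a data node."""
    block_name = hbase_meta[work_order_no]   # step 1: block name from the metadata
    for node in name_node[block_name]:       # step 2: data nodes holding the block
        store = data_nodes.get(node, {})
        if block_name in store:              # step 3: read from the first node that has it
            return store[block_name]
    raise LookupError(f"no reachable copy of {block_name}")
```

For example, if the first listed data node no longer holds the block, the loop falls through to the next node in the list and returns the copy stored there.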
According to an embodiment of the present invention, there is also provided an embodiment of a data processing apparatus, and it should be noted that the data processing apparatus may be configured to execute the data processing method in the embodiment of the present invention, and the data processing method in the embodiment of the present invention may be executed in the data processing apparatus.
Fig. 7 is a schematic diagram of an optional data processing apparatus according to an embodiment of the present invention. As shown in Fig. 7, the apparatus may include: an acquisition unit 71, configured to acquire data to be processed, where the data to be processed is data used for reflecting work order information of a target object, and the data type of the data to be processed at least includes a target data type; a determining unit 73, configured to determine a target data dividing mode corresponding to the target data type; and a dividing unit 75, configured to divide the data whose data type is the target data type according to the target data dividing mode.
It should be noted that the acquisition unit 71 in this embodiment may be configured to execute step S102 in this embodiment, the determining unit 73 in this embodiment may be configured to execute step S104 in this embodiment, and the dividing unit 75 in this embodiment may be configured to execute step S106 in this embodiment. The examples and application scenarios implemented by these units are the same as those of the corresponding steps, but are not limited to what is disclosed in the above embodiments.
In this embodiment, the data to be processed, which reflects the work order information of the target object, and its target data type are acquired; the target data dividing mode corresponding to the target data type is determined; and the data whose data type is the target data type is then divided according to that mode. In this way, each type of data is processed according to its own data dividing mode, so that different types of data receive different processing. This achieves the technical effect of improving data processing efficiency and solves the technical problem in the related art of low data processing efficiency caused by processing all types of data uniformly.
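The acquire, determine, and divide flow can be sketched as a simple type-to-mode lookup. The enum, the mode names, and the function names below are hypothetical labels for the three dividing modes named in the text; the sketch only groups records by the mode that would be applied and does not perform the division itself.

```python
from enum import Enum
from typing import Dict, Iterable, List, Tuple


class DataType(Enum):
    IMAGE = "image"
    VOICE = "voice"
    STRUCTURED_TEXT = "structured_text"


# Hypothetical names summarising the dividing mode for each target data type.
DIVIDING_MODES: Dict[DataType, str] = {
    DataType.IMAGE: "split_by_geometric_shape",
    DataType.VOICE: "merge_below_threshold",
    DataType.STRUCTURED_TEXT: "split_data_table",
}


def determine_mode(data_type: DataType) -> str:
    """Determine the target data dividing mode for a target data type."""
    return DIVIDING_MODES[data_type]


def plan_division(records: Iterable[Tuple[str, DataType]]) -> Dict[str, List[str]]:
    """Group the acquired records by the dividing mode that applies to their type."""
    plan: Dict[str, List[str]] = {}
    for name, data_type in records:
        plan.setdefault(determine_mode(data_type), []).append(name)
    return plan
```

Keeping the mapping in one table means a new data type only needs a new entry and its own dividing routine, without changing how the other types are handled.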
As an alternative embodiment, the target data type may include at least one of: image format data; voice format data; structured text format data.
As an optional embodiment, the apparatus may include: an image dividing module, for which, when the target data type is image format data, the target data dividing mode is splitting the image format data according to a geometric shape; a voice dividing module, for which, when the target data type is voice format data, the target data dividing mode is merging the voice format data whose data volume is lower than a preset threshold; and a text dividing module, for which, when the target data type is structured text format data, the target data dividing mode is splitting the data table corresponding to the structured text format data.
As an optional embodiment, when the target data type is voice format data, the dividing unit may include: an acquisition module, configured to acquire the data volume of the voice format data; a judging module, configured to judge whether the data volume of the voice format data is lower than a preset threshold; a determining module, configured to determine the voice format data as voice format data to be merged when the data volume of the voice format data is lower than the preset threshold; and a merging module, configured to merge the voice format data to be merged.
As an optional embodiment, the merging module may be configured to execute the following merging operation on the voice format data to be merged to obtain a voice format data block, until the data volume of the voice format data block is not lower than the preset threshold, where the voice format data to be merged is recorded as the current voice format data when the merging operation is executed. The merging module may include: a merging submodule, configured to merge the current voice format data into the voice format data block; a judging submodule, configured to judge whether the data volume of the voice format data block is lower than the preset threshold; and a determining submodule, configured to determine the next voice format data as the current voice format data when the data volume of the voice format data block is lower than the preset threshold.
As an optional embodiment, the apparatus may further include, following the dividing unit: a storage module, configured to store in a target database the target data blocks obtained by dividing the data whose data type is the target data type.
As an optional embodiment, the apparatus may further include, following the storage module: an index module, configured to set a target index mode in the target database for the data whose data type is the target data type.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units may be a logical division, and in actual implementation, there may be another division, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program code.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. A data processing method, comprising:
acquiring data to be processed, wherein the data to be processed is data used for reflecting work order information of a target object, and the data type of the data to be processed at least comprises a target data type;
determining a target data dividing mode corresponding to the target data type;
performing data division on the data with the data type as the target data type according to the target data division mode;
wherein the target data type comprises at least one of: image format data; voice format data; structured text format data;
the method is characterized in that, when the target data type is the image format data, the target data dividing mode is a mode of segmenting the image format data according to a geometric shape;
when the target data type is the voice format data, the target data dividing mode is a mode of combining the voice format data of which the data volume is lower than a preset threshold value;
and under the condition that the target data type is the structured text format data, the target data dividing mode is a mode of splitting a data table corresponding to the structured text format data.
2. The method according to claim 1, wherein in a case that the target data type is the voice format data, the data partitioning of the data of which the data type is the target data type according to the target data partitioning manner comprises:
acquiring the data volume of the voice format data;
judging whether the data volume of the voice format data is lower than a preset threshold value or not;
under the condition that the data volume of the voice format data is lower than the preset threshold value, determining the voice format data as voice format data to be merged;
and merging the voice format data to be merged.
3. The method according to claim 2, wherein said merging the voice format data to be merged comprises:
executing the following merging operation on the voice format data to be merged to obtain a voice format data block until the data volume of the voice format data block is not lower than the preset threshold value, wherein the voice format data to be merged is marked as the current voice format data when the merging operation is executed:
merging the current voice format data into the voice format data block;
judging whether the data volume of the voice format data block is lower than the preset threshold value or not;
and determining the next voice format data as the current voice format data under the condition that the data quantity of the voice format data block is lower than the preset threshold value.
4. The method according to claim 1, wherein after the data partitioning of the data of which the data type is the target data type according to the target data partitioning manner, the method further comprises:
and storing a target data block obtained by dividing the data with the data type as the target data type in a target database.
5. The method according to claim 4, wherein after the target data block obtained by dividing the data of which the data type is the target data type is stored in a target database, the method further comprises:
and setting a target index mode for the data with the data type as the target data type in the target database.
6. A data processing apparatus, comprising:
the acquisition unit is used for acquiring data to be processed, wherein the data to be processed is data used for reflecting work order information of a target object, and the data type of the data to be processed at least comprises a target data type;
the determining unit is used for determining a target data dividing mode corresponding to the target data type;
the dividing unit is used for carrying out data division on the data with the data type as the target data type according to the target data dividing mode;
wherein the target data type comprises at least one of: image format data; voice format data; structured text format data;
the image dividing module, wherein, under the condition that the target data type is the image format data, the target data dividing mode is a mode of segmenting the image format data according to a geometric shape;
the voice dividing module, wherein, under the condition that the target data type is the voice format data, the target data dividing mode is a mode of merging the voice format data of which the data volume is lower than a preset threshold value;
and the text dividing module, wherein, under the condition that the target data type is the structured text format data, the target data dividing mode is a mode of splitting a data table corresponding to the structured text format data.
7. The apparatus according to claim 6, wherein in the case that the target data type is the voice format data, the dividing unit comprises:
the acquisition module is used for acquiring the data volume of the voice format data;
the judging module is used for judging whether the data volume of the voice format data is lower than a preset threshold value or not;
the determining module is used for determining the voice format data as the voice format data to be merged under the condition that the data volume of the voice format data is lower than the preset threshold value;
and the merging module is used for merging the voice format data to be merged.
8. The apparatus of claim 7, wherein the merging module comprises:
executing the following merging operation on the voice format data to be merged to obtain a voice format data block until the data volume of the voice format data block is not lower than the preset threshold value, wherein the voice format data to be merged is marked as the current voice format data when the merging operation is executed:
a merging submodule, configured to merge the current voice format data into the voice format data block;
the judging submodule is used for judging whether the data volume of the voice format data block is lower than the preset threshold value or not;
and the determining submodule is used for determining the next voice format data as the current voice format data under the condition that the data quantity of the voice format data block is lower than the preset threshold value.
9. The apparatus of claim 6, wherein after the dividing unit, the apparatus further comprises:
and the storage module is used for storing a target data block obtained by dividing the data with the data type as the target data type in a target database.
10. The apparatus of claim 9, wherein after the storing module, the apparatus further comprises:
and the index module is used for setting a target index mode for the data with the data type as the target data type in the target database.
CN201710106370.7A 2017-02-24 2017-02-24 Data processing method and device Active CN107016050B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710106370.7A CN107016050B (en) 2017-02-24 2017-02-24 Data processing method and device

Publications (2)

Publication Number Publication Date
CN107016050A CN107016050A (en) 2017-08-04
CN107016050B true CN107016050B (en) 2019-12-20

Family

ID=59440506

Country Status (1)

Country Link
CN (1) CN107016050B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113554513A (en) * 2017-11-28 2021-10-26 创新先进技术有限公司 Data processing method, device and system
CN108248641A (en) * 2017-12-06 2018-07-06 中国铁道科学研究院电子计算技术研究所 A kind of urban track traffic data processing method and device
CN110446228B (en) * 2019-08-13 2022-02-22 腾讯科技(深圳)有限公司 Data transmission method, device, terminal equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102904738A (en) * 2011-07-26 2013-01-30 华为软件技术有限公司 Work order processing method, relevant device and relevant system
CN102957544A (en) * 2011-08-17 2013-03-06 中国移动通信集团上海有限公司 Method and device for transmitting service work orders and service work order processing system
CN103714812A (en) * 2013-12-23 2014-04-09 百度在线网络技术(北京)有限公司 Voice identification method and voice identification device
CN104869006A (en) * 2014-02-25 2015-08-26 中国移动通信集团上海有限公司 Data service automatic activation method and platform

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant