CN113849464A

CN113849464A - Information processing method and apparatus

Info

Publication number: CN113849464A
Application number: CN202111152598.2A
Authority: CN
Inventors: 鲍金玉; 杨晨光; 王萌
Original assignee: Lenovo Beijing Ltd
Current assignee: Lenovo Beijing Ltd
Priority date: 2021-09-29
Filing date: 2021-09-29
Publication date: 2021-12-28

Abstract

The embodiment of the application discloses an information processing method and equipment, wherein the method comprises the following steps: acquiring a first service file set; each service file in the first service file set is generated under the condition of processing a specific type of service; analyzing the service information in each service file in the first service file set, and determining at least one service index information corresponding to the first service file set; determining at least one index range information respectively corresponding to the at least one service index information; the at least one index range information is used for analyzing the service information of each service file in the second service file set to be analyzed, and determining an abnormal service file in the second service file set.

Description

Information processing method and apparatus

Technical Field

The embodiment of the application relates to, but is not limited to, the technical field of electronics and information, and in particular relates to an information processing method and device.

Background

A large number of service files exist in the service system, and some abnormal files exist in the service files, for example, files with incomplete information, files with missing information, or dummy files.

In the related art, a worker must check a large number of service files one by one to identify the abnormal files among them. This approach is inefficient and wastes a large amount of human resources.

Disclosure of Invention

The embodiment of the application provides an information processing method and equipment.

An embodiment of the present application provides an information processing method, including:

acquiring a first service file set; each service file in the first service file set is generated under the condition of processing a specific type of service;

analyzing the service information in each service file in the first service file set, and determining at least one service index information corresponding to the first service file set;

determining at least one index range information respectively corresponding to the at least one service index information; the at least one index range information is used for analyzing the service information of each service file in the second service file set to be analyzed, and determining an abnormal service file in the second service file set.

In some embodiments, the first service file set comprises a first part of service file set; the first part of service file set comprises a plurality of sub service file sets which are respectively obtained from a plurality of data sources;

the analyzing the service information in each service file in the first service file set to determine at least one service index information corresponding to the first service file set includes:

determining a plurality of service index information corresponding to each sub-service file set;

and determining the at least one service index information based on the plurality of service index information and the attribute information of the data source corresponding to each sub service file set.

In some embodiments, the first service file set comprises a second part of service file set;

extracting keywords from the service information of each service file in the second part of service file set to obtain a keyword set;

and determining the at least one service index information based on the keywords of which the occurrence times are greater than a first threshold value in the keyword set.

In some embodiments, the first service file set comprises a third part of service file set;

extracting a first feature vector from the service information of each service file in the third part of service file set to obtain a feature vector set;

clustering the characteristic vector set to obtain a first clustering result;

and determining the at least one service index information based on the keywords corresponding to the service files in each category in the first clustering result.

In some embodiments, the determining at least one index range information corresponding to the at least one service index information respectively includes:

determining a target service file set corresponding to each service index information in the at least one service index information; the target service file set is included in the first service file set;

determining the ratio of the number of the service files in the target service file set to the number of the service files in the first service file set;

determining statistical parameters of the service files in the target service file set;

determining the at least one index range information based on the ratio and/or the statistical parameter.

determining at least one first range information respectively corresponding to the at least one service index information;

acquiring at least one second range information which is specified in advance;

determining a union of the at least one first range information and the at least one second range information as the at least one index range information.

and responding to a modification operation for modifying the target range information in the at least one first range information to obtain the at least one index range information.

In some embodiments, after determining at least one indicator range information corresponding to the at least one service indicator information, the method further comprises:

determining a second feature vector of each service file in the second service file set;

performing dimension reduction on the second eigenvector of each service file to obtain a third eigenvector of each service file;

clustering the third feature vectors of each service file to obtain a second clustering result;

and marking the abnormal service file in the displayed second clustering result.

In some embodiments, the method further comprises:

determining a target feature vector corresponding to the abnormal service file from the third feature vector of each service file;

acquiring at least one adjacent characteristic vector of the target characteristic vector;

determining a specific service file corresponding to the at least one adjacent feature vector from the second service file set;

and determining the object associated with the specific service file as an object to be concerned.
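The neighbor lookup in the steps above can be sketched as follows. This is a minimal illustration only: it assumes Euclidean distance as the measure of adjacency and two-dimensional third feature vectors, neither of which is specified by the application, and the function and variable names are hypothetical.

```python
import math


def nearest_neighbors(target_index, vectors, k=2):
    """Return indexes of the k vectors closest to vectors[target_index].

    Euclidean distance is an assumed choice of "adjacency".
    """
    target = vectors[target_index]
    distances = [
        (math.dist(target, vec), i)
        for i, vec in enumerate(vectors)
        if i != target_index
    ]
    return [i for _, i in sorted(distances)[:k]]


# Hypothetical 2-D third feature vectors and the object tied to each file.
third_vectors = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0)]
file_objects = ["engineer_a", "engineer_b", "engineer_c", "engineer_d"]
abnormal_index = 0  # the file already marked as abnormal
neighbors = nearest_neighbors(abnormal_index, third_vectors, k=2)
objects_to_watch = [file_objects[i] for i in neighbors]
```

Here the objects associated with the two files nearest to the abnormal file become the objects to be concerned.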

An embodiment of the present application provides an information processing apparatus, including a memory and a processor, wherein

the memory stores a computer program operable on the processor, and

the processor implements the steps of the method described above when executing the computer program.

In the embodiment of the application, the service information in each service file in the first service file set is analyzed to determine at least one service index information corresponding to the first service file set, and at least one index range information respectively corresponding to the at least one service index information is then determined. The index range information can therefore be determined automatically from the historically obtained service files in the first service file set and then used to determine whether newly obtained service files are abnormal service files, which makes identifying abnormal files efficient and reduces the waste of human resources.

Drawings

The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application.

Fig. 1 is a schematic flow chart illustrating an implementation of an information processing method according to an embodiment of the present application;

Fig. 2 is a schematic flow chart illustrating an implementation of another information processing method according to an embodiment of the present application;

Fig. 3 is a schematic flow chart illustrating an implementation of another information processing method according to an embodiment of the present application;

Fig. 4 is a schematic flow chart illustrating an implementation of another information processing method according to an embodiment of the present application;

Fig. 5 is a schematic flow chart illustrating an implementation of an information processing method according to another embodiment of the present application;

Fig. 6 is a schematic structural diagram of an information processing apparatus according to an embodiment of the present application;

Fig. 7 is a hardware entity diagram of an information processing apparatus according to an embodiment of the present application.

Detailed Description

The technical solution of the present application will be specifically described below by way of examples with reference to the accompanying drawings. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments.

It should be noted that: in the present examples, "first", "second", etc. are used for distinguishing similar objects and are not necessarily used for describing a particular order or sequence.

The technical means described in the embodiments of the present application may be arbitrarily combined without conflict. In the description of the present application, "a plurality" means two or more unless specifically limited otherwise.

The embodiment of the present application uses an after-sales service as the specific type of service to illustrate the information processing method. Where the specific type of service is another service, reference may be made to the after-sales service example.

A service system includes a large number of commercial after-sales maintenance service stations, which employ engineers to provide the actual after-sales maintenance service. In practice, however, suppliers, service stations, and engineers may each open false work orders while providing that service, which harms the company's operations and causes serious economic loss.

In the related art, false work orders in a service system are examined and screened manually. A supervisor collects a large amount of work order data along dimensions such as supplier, service station, and engineer; evaluates that data against business indexes distilled from experience; infers relatively abnormal data from how the full body of work order data is distributed over each business index; and finally screens out, from the relatively abnormal data, the service stations or engineers that engage in violations such as opening false work orders.

However, manual screening requires the supervisor to have in-depth knowledge of the business system, supply chain, channel structure, operation system, and so on, and places very high demands on the supervisor's working experience, industry experience, and years of work in related fields. Constructing business indexes for manual screening depends heavily on the skill and experience of a business analysis expert, and the applicability and effectiveness of the evaluation indexes are directly tied to the supervisor's industry experience. The index data used for violation screening relies heavily on domain experience, and that domain knowledge is hard to reuse or extend: it is difficult to transfer existing experience to similar application scenarios, and manually constructing index data for screening is especially difficult for beginners.

Fig. 1 is a schematic flow chart of an implementation of an information processing method provided in an embodiment of the present application, and as shown in fig. 1, the method is applied to an information processing apparatus, and the method includes:

S101, acquiring a first service file set; each service file in the first service file set is generated under the condition of processing a specific type of service.

The information processing apparatus may include a processor, a cluster of processors, a chip, or a cluster of chips, so that the processor, the cluster of processors, the chip, or the cluster of chips perform the information processing method in the embodiment of the present application. In other embodiments, an information processing apparatus may perform the information processing method in the embodiments of the present application.

The information processing apparatus may include one of the following or a combination of at least two of the following: a server, a mobile phone, a tablet computer (Pad), a computer with a wireless transceiving function, a palmtop computer, a desktop computer, a personal digital assistant, a portable media player, a smart speaker, a navigation device, a wearable device (such as a smart watch, smart glasses, or a smart necklace), a pedometer, a digital TV, a Virtual Reality (VR) terminal device, an Augmented Reality (AR) terminal device, a wireless terminal in industrial control, a wireless terminal in self-driving, a wireless terminal in remote surgery, a wireless terminal in a smart grid, a wireless terminal in transportation safety, a wireless terminal in a smart city, a wireless terminal in a smart home, a vehicle in an Internet of Vehicles system, a vehicle-mounted device, a vehicle-mounted module, and the like.

In some embodiments, the specific type of service may be a predefined service type. In other embodiments, the information processing apparatus may receive a user's operation selecting a specific type of service from among a plurality of displayed types of service, and thereby determine the specific type of service.

The specific type of service may include at least one of: a service business (e.g., at least one of an after-sales service, an in-sales service, a pre-sales service, a consulting service, etc.), a ticketing business, a questionnaire survey business, etc.

All the service files in the first service file set may carry a label indicating whether they are abnormal service files, and the labels may be calibrated in advance. In other embodiments, some of the service files in the first service file set may carry such a label while the others do not.

In some implementation scenarios, when an after-sales service is processed, an engineer who has repaired a product may fill in information on a work order, for example at least one of the engineer, a fault description, the repaired component, the repair cost, the product model, a fault image, and so on, thereby generating a service work order; this service work order is a service file. In other implementation scenarios, a service file may be a bill file, such as an invoice (e.g., a hotel invoice, a bus invoice, a shopping invoice, etc.), generated when a specific type of service is processed. In still other implementation scenarios, a service file may be a questionnaire generated when a questionnaire survey business is processed. The embodiment of the application does not limit how the service files in the first service file set are obtained, as long as they reflect the processing of the specific type of service.

The first set of business files may include a plurality of sets of sub-business files, sub-business files in different sets of sub-business files being generated in the event of different specific types of business being processed, and/or sub-business files in different sets of sub-business files being generated at different geographic locations (e.g., different administrative regions).

S102, analyzing the service information in each service file in the first service file set, and determining at least one service index information corresponding to the first service file set.

The service information may be all or part of the information in the service file. For example, the service information may be service-related information in a service file. Exemplarily, in the case of taking the service file as the service work order, the service information may include at least one of the following: engineer, trouble shooting, repair parts. Further exemplarily, in case that the service file is a ticket file, the service information may include at least one of: the uploader of the bill file, the buyer information, the seller information, the goods or taxable labor, the service name, etc. in the bill file.

Illustratively, the at least one service index information may comprise at least one of: the proportion of each device type among the work handled by an engineer (for example, the proportion for mobile phones, the proportion for computers, and the like), the proportion of service work orders whose warranty is backed by an invoice, the proportion of cases in which the same invoice is used multiple times, the frequency of faults in a certain part of a product, the frequency with which a service station repairs a certain product, and the like.

The at least one service index information identifies which indexes to focus on in the first service file set. It is needed because personnel who must identify abnormal service files often do not know which service index information to evaluate. In the embodiment of the application, determining the at least one service index information from the first service file set ties the indexes to the service information in that set, which makes the determined service index information more targeted.

S103, determining at least one index range information corresponding to the at least one service index information respectively; the at least one index range information is used for analyzing the service information of each service file in the second service file set to be analyzed, and determining an abnormal service file in the second service file set.

The at least one service index information may have a one-to-one correspondence with the at least one index range information. For example, when a certain service index information is A, the corresponding index range may be A > α and A ≤ β, where α is less than β. For another example, when a certain service index information is B, the corresponding index range may be γ ≤ B ≤ δ.
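A minimal sketch of such a range check follows, assuming that a range is stored as a pair of bounds with a flag stating whether each end is inclusive. The class `IndexRange`, its field names, and the numeric bounds are illustrative assumptions and are not taken from the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexRange:
    """Hypothetical container for one piece of index range information."""
    low: float
    high: float
    low_inclusive: bool = True
    high_inclusive: bool = True

    def contains(self, value: float) -> bool:
        # Check the lower bound, honoring whether it is open or closed.
        above = value >= self.low if self.low_inclusive else value > self.low
        # Check the upper bound in the same way.
        below = value <= self.high if self.high_inclusive else value < self.high
        return above and below


# An open lower bound and closed upper bound, with illustrative values 0.2 and 0.6.
range_a = IndexRange(low=0.2, high=0.6, low_inclusive=False)
# Both bounds closed, with illustrative values 1 and 5.
range_b = IndexRange(low=1.0, high=5.0)
```

With these definitions, `range_a.contains(0.2)` is false because the lower bound is open, while `range_b.contains(1.0)` is true.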

In some embodiments, the at least one service index information may be displayed so that a user can determine the at least one index range information and input it to the information processing apparatus, which thereby obtains the at least one index range information. In this case, the user determines the index range information corresponding to each service index information.

In other embodiments, the at least one index range information may be determined based on a distribution of the at least one service index information in the first set of service files. For example, by determining which service files in the first service file set are associated with the service index information a, which service files in the first service file set are associated with the service index information B, and so on, the service files in the first service file set associated with each service index information in the at least one service index information can be determined, so that the at least one index range information is determined based on the statistical parameters of the service files in the first service file set associated with each service index information. Wherein the statistical parameter may include at least one of: a value obtained by at least one of addition, subtraction, multiplication, division, exponential operation, logarithmic operation, mean, median, mode, minimum, maximum, variance, standard deviation, quantile, and the like.
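One possible way to turn statistical parameters into a range is sketched below. The rule used here (mean plus or minus k standard deviations), the default k, and the sample values are all assumptions for illustration; the application only lists the kinds of statistics that may be used and does not prescribe a formula.

```python
import statistics


def range_from_statistics(values, k=2.0):
    """Return (low, high) as the mean minus/plus k population standard deviations.

    The rule and the default k are assumptions chosen for illustration.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return mean - k * std, mean + k * std


# Hypothetical historical values of one service index (e.g. a repair ratio).
historical_values = [0.70, 0.75, 0.80, 0.85, 0.90]
low, high = range_from_statistics(historical_values)
```

A quantile-based rule (for example, the 5th to 95th percentile) would be an equally plausible choice and could replace the body of the function without changing its interface.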

In some embodiments, the at least one index range information determined by the information processing device may be at least one index range information corresponding to normal service files. When the second service file set is analyzed using the at least one index range information, the service index information of each file in the second service file set may be determined; a service file that falls within the at least one index range information is determined to be a normal service file, and a service file that falls outside the at least one index range information is determined to be an abnormal service file.

In other embodiments, the at least one index range information determined by the information processing device may include at least one index range information corresponding to normal service files and at least one index range information corresponding to abnormal service files. In this case, a service file within the at least one index range information corresponding to normal service files is determined to be a normal service file; a service file within the at least one index range information corresponding to abnormal service files is determined to be an abnormal service file; and a service file outside both is determined to be a service file to be attended to. A service file to be attended to may be determined to be normal or abnormal by other means, or may be displayed so that the relevant staff can determine whether it is a normal service file or an abnormal service file.
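The three-way decision described above can be sketched as follows. The function name, the label strings, the inclusive bounds, and the numeric ranges are assumptions made for illustration only.

```python
def classify(value, normal_range, abnormal_range):
    """Label one index value as 'normal', 'abnormal', or 'to_review'.

    Each range is a (low, high) pair with inclusive bounds (an assumption).
    """
    if normal_range[0] <= value <= normal_range[1]:
        return "normal"
    if abnormal_range[0] <= value <= abnormal_range[1]:
        return "abnormal"
    # Outside both ranges: left for another method or a human reviewer.
    return "to_review"


# Illustrative ranges only; the application does not give these numbers.
NORMAL = (0.7, 1.0)
ABNORMAL = (0.0, 0.4)
labels = [classify(v, NORMAL, ABNORMAL) for v in (0.8, 0.3, 0.55)]
```

The value 0.55 falls in neither range, so it is returned for review rather than being forced into one of the two classes.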

The second set of business files may include a plurality of business files, and the index value of at least one business file associated with each object (e.g., engineer, service station or provider, etc.) in the second set of business files may be matched with at least one index range information, and if the index value is within the corresponding one index range information, it indicates that the associated at least one business file is a normal business file.

For example, the at least one index range information may include: the proportion of motherboard repairs among an engineer's work is greater than 0.7. If the proportion of motherboard repairs is 0.8 among the service files in the second service file set associated with engineer Zhang San, those service files are normal service files; if the proportion is 0.5, those service files are abnormal service files.
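The per-engineer check in this example can be sketched as below, using the 0.7 bound and the proportions 0.8 and 0.5 from the text. The record fields `engineer` and `repaired_part`, the engineer identifiers, and the function names are hypothetical.

```python
from collections import defaultdict


def motherboard_ratio_by_engineer(work_orders):
    """Map each engineer to the share of their work orders repairing a motherboard."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for order in work_orders:
        totals[order["engineer"]] += 1
        if order["repaired_part"] == "motherboard":
            hits[order["engineer"]] += 1
    return {name: hits[name] / totals[name] for name in totals}


def flag_abnormal_engineers(work_orders, threshold=0.7):
    """Return engineers whose ratio is not greater than the threshold."""
    ratios = motherboard_ratio_by_engineer(work_orders)
    return sorted(name for name, ratio in ratios.items() if not ratio > threshold)


# Hypothetical second service file set: 4 of 5 orders (0.8) vs. 1 of 2 orders (0.5).
orders = (
    [{"engineer": "engineer_a", "repaired_part": "motherboard"}] * 4
    + [{"engineer": "engineer_a", "repaired_part": "keyboard"}]
    + [{"engineer": "engineer_b", "repaired_part": "motherboard"}]
    + [{"engineer": "engineer_b", "repaired_part": "display"}]
)
flagged = flag_abnormal_engineers(orders)
```

Only `engineer_b`, whose proportion is 0.5, is flagged; all service files associated with a flagged engineer would then be treated as abnormal service files.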

In some embodiments, the first service file set may include a plurality of different service file sets, and the at least one service index information is determined in a different manner for each of these service file sets. For example, the first service file set may include at least one of: a first part of service file set, a second part of service file set, and a third part of service file set.

The following describes a manner of determining at least one service index information corresponding to the first part of service file set:

the first service file set comprises a first part of service file set; the first part of service file set comprises a plurality of sub service file sets which are respectively obtained from a plurality of data sources; the analyzing the service information in each service file in the first service file set to determine at least one service index information corresponding to the first service file set includes: determining a plurality of service index information corresponding to each sub-service file set; and determining the at least one service index information based on the plurality of service index information and the attribute information of the data source corresponding to each sub service file set.

The different ones of the plurality of data sources may include at least one of the following: different suppliers, different service stations, different service areas (e.g., different service provinces).

In some embodiments, determining a plurality of service index information corresponding to each of the sets of sub-service files may include: analyzing the service information in each service file in each sub-service file set, and determining a plurality of service index information corresponding to each sub-service file set. In other embodiments, determining a plurality of service index information corresponding to each of the sub-service file sets may include: and manually analyzing the service files in each sub-service file set to determine a plurality of service index information corresponding to each sub-service file set.

The attribute information of the data source may include: a supplier corresponding to the data source, a service station corresponding to the data source, and a service area corresponding to the data source.

The at least one service index information may include: at least one service index information corresponding to each service area in a plurality of service areas, where the service index information corresponding to different service areas differs; or may include: at least one service index information corresponding to each service station in a plurality of service stations, where the service index information corresponding to different service stations differs.

Determining the at least one service indicator information may include: and determining at least one service index information corresponding to each attribute information in the attribute information of different data sources. In the case that one attribute information corresponds to at least two sub-service file sets, a plurality of service index information respectively corresponding to the at least two sub-service file sets may be determined, and then a union of the plurality of service index information respectively corresponding to the at least two sub-service file sets is determined as at least one service index information corresponding to the attribute information.
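The union step described above can be sketched as follows, assuming each sub service file set has already been reduced to a list of index names and is tagged with one attribute value of its data source. The function name, the attribute values, and the index names are hypothetical.

```python
from collections import defaultdict


def indexes_by_attribute(sub_sets):
    """Union the index names of all sub-sets that share one attribute value.

    `sub_sets` is a list of (attribute_value, index_names) pairs, where the
    attribute value could be, for example, a service area.
    """
    merged = defaultdict(set)
    for attribute_value, index_names in sub_sets:
        merged[attribute_value] |= set(index_names)
    return dict(merged)


# Hypothetical sub service file sets from three data sources in two areas.
sub_sets = [
    ("area_north", ["motherboard_ratio", "invoice_reuse_ratio"]),
    ("area_north", ["motherboard_ratio", "display_fault_frequency"]),
    ("area_south", ["keyboard_fault_frequency"]),
]
result = indexes_by_attribute(sub_sets)
```

The two sub-sets that share `area_north` contribute three distinct index names in total, while `area_south` keeps its single index.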

In the embodiment of the application, information such as the available data analysis and evaluation dimensions is obtained by processing and analyzing data that already exists in the current system. Where the data volume is large, big-data statistical analysis can be performed on the data in bulk: data from a plurality of different data sources is collected and integrated, a distributed database or a distributed computing cluster is used to analyze, classify, and summarize the stored massive structured and unstructured data, and the key information contained in the data is extracted to meet actual analysis and usage requirements.

The following describes a manner of determining at least one service index information corresponding to the second part of service file set:

the first service file set comprises a second part of service file set; the analyzing the service information in each service file in the first service file set to determine at least one service index information corresponding to the first service file set includes: extracting keywords from the service information of each service file in the second part of service file set to obtain a keyword set; and determining the at least one service index information based on the keywords of which the occurrence times are greater than a first threshold value in the keyword set.

In some embodiments, a Natural Language Processing (NLP) strategy may be adopted to extract keywords from the service information of each service file, so as to obtain a keyword set.

The keywords may be information related to parts of the product. For example, a keyword may be a motherboard, a display screen, a keyboard, and the like. In some embodiments, the service information of a certain service file states that the motherboard failed and the motherboard was replaced, and the keyword extracted from the service information of that service file is the motherboard.

In some embodiments, the keyword set contains 50 occurrences of keyword A, 20 occurrences of keyword B, and 5 occurrences of keyword C; in the case that the first threshold is 10, the at least one service index information may be determined based on keyword A and keyword B. In some embodiments, the size of the first threshold may be related to the number of service files in the first service file set, for example, the ratio of the first threshold to the number of service files in the first service file set may be a fixed value. In other embodiments, the first threshold may be a fixed value.
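The threshold filter in this example can be sketched as follows, using the counts 50, 20, and 5 and the first threshold of 10 from the text. The function names and the ratio passed to `relative_threshold` are illustrative assumptions.

```python
from collections import Counter


def frequent_keywords(keywords, first_threshold):
    """Return keywords whose occurrence count is greater than the threshold."""
    counts = Counter(keywords)
    return sorted(word for word, n in counts.items() if n > first_threshold)


# The counts from the example in the text: 50 x A, 20 x B, 5 x C.
keyword_set = ["A"] * 50 + ["B"] * 20 + ["C"] * 5
selected = frequent_keywords(keyword_set, first_threshold=10)


def relative_threshold(num_files, ratio):
    """Threshold that scales with the number of service files (ratio is assumed)."""
    return num_files * ratio
```

Keywords A and B pass the filter and keyword C does not. `relative_threshold` shows the variant in which the threshold is a fixed fraction of the number of service files.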

In some embodiments, the keyword whose occurrence number is greater than the first threshold may be directly determined as the at least one service index information. In other embodiments, the at least one business indicator information may be determined based on the keyword having a number of occurrences greater than the first threshold and at least one of a corresponding engineer, service station, provider, service area. For example, the keyword greater than the first threshold is a motherboard, and the at least one piece of service indicator information may be at least one of a motherboard associated with a service area, a motherboard associated with an engineer, a motherboard associated with a service station, and a motherboard associated with a provider.

In the embodiment of the application, the required key information is extracted using a natural language processing strategy. Text information in the actual data, such as maintenance records and work order records, serves as the raw data to be further processed and analyzed. Ordinary text cannot be used directly for machine learning, but certain preprocessing operations can convert it into a form that a machine can recognize. Typical steps of a natural language processing strategy include word segmentation, stemming, part-of-speech tagging, and stop-word removal; keywords are extracted through these operations as the dimension features actually required.
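A heavily simplified sketch of two of these preprocessing steps, word segmentation and stop-word removal, is given below. It assumes English text that can be split with a regular expression, and it omits stemming and part-of-speech tagging; text in other languages, such as Chinese, would need a dedicated word segmenter. The stop-word list, the function name, and the sample record are illustrative.

```python
import re

# A tiny illustrative stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "a", "an", "was", "and", "is", "of", "to"}


def extract_candidate_keywords(text):
    """Lowercase, split into word tokens, and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [token for token in tokens if token not in STOP_WORDS]


record = "The motherboard failed and the motherboard was replaced."
candidates = extract_candidate_keywords(record)
```

The remaining tokens can then be counted across service files to find frequently occurring keywords such as the name of a part.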

The following describes a manner of determining at least one service index information corresponding to the third part of service file set:

The first service file set comprises a third part of service file set; the analyzing the service information in each service file in the first service file set to determine at least one service index information corresponding to the first service file set includes: extracting a first feature vector from the service information of each service file in the third part of service file set to obtain a feature vector set; clustering the feature vector set to obtain a first clustering result; and determining the at least one service index information based on the keywords corresponding to the service files in each category in the first clustering result.

In some embodiments, a first feature vector corresponding to the service information of each service file may be determined directly, resulting in the feature vector set. In other embodiments, keyword extraction may be performed on the service information of each service file to obtain the keywords corresponding to each service file, and the first feature vector of each service file is then obtained based on those keywords, resulting in the feature vector set.

The dimensionality of each vector in the feature vector set is the same, the dimensionality can be a preset value, and under the condition that the dimensionality of the vector obtained based on the service information or the keywords is not the preset value, dimensionality compression or dimensionality expansion can be performed on the obtained vector, so that the dimensionality of the vector reaches the preset value.
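As a minimal sketch of bringing every vector to the preset dimensionality, the function below truncates vectors that are too long and zero-pads vectors that are too short. Truncation and zero-padding are assumptions chosen for simplicity; the application does not specify how dimensionality compression or expansion is performed, and `fit_dimension` is a hypothetical name.

```python
def fit_dimension(vector, preset_dim, fill=0.0):
    """Return a copy of vector with exactly preset_dim entries.

    Longer vectors are truncated (a crude form of compression) and shorter
    vectors are padded with `fill` (a crude form of expansion).
    """
    if len(vector) >= preset_dim:
        return list(vector[:preset_dim])
    return list(vector) + [fill] * (preset_dim - len(vector))


print(fit_dimension([1.0, 2.0, 3.0], 2))  # [1.0, 2.0]
print(fit_dimension([1.0], 3))            # [1.0, 0.0, 0.0]
```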

Determining the at least one service index information based on the keywords corresponding to the service files in each category in the first clustering result may include: extracting keywords from each service file under each category to obtain a plurality of keywords corresponding to each category, and determining the keywords whose number of occurrences is greater than a second threshold among the plurality of keywords as the service index information corresponding to each category, so that the union of the service index information corresponding to all categories is determined as the at least one service index information.
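The per-category selection and the final union can be sketched as follows, assuming the clustering step has already produced one category label per service file. The function name `indicators_from_clusters` and the sample data are hypothetical.

```python
from collections import Counter, defaultdict


def indicators_from_clusters(cluster_labels, file_keywords, second_threshold):
    """Select frequent keywords per category and return them with their union.

    cluster_labels[i] is the category of service file i, and file_keywords[i]
    is the list of keywords extracted from that file.
    """
    per_cluster_counts = defaultdict(Counter)
    for label, keywords in zip(cluster_labels, file_keywords):
        per_cluster_counts[label].update(keywords)

    # Keep only keywords whose count exceeds the second threshold in each category.
    per_cluster = {
        label: {kw for kw, n in counts.items() if n > second_threshold}
        for label, counts in per_cluster_counts.items()
    }
    # The union over all categories is the at least one service index information.
    union = set().union(*per_cluster.values())
    return per_cluster, union


labels = [0, 0, 1, 1]
keywords = [["motherboard", "fan"], ["motherboard"], ["battery"], ["battery", "screen"]]
per_cluster, indicators = indicators_from_clusters(labels, keywords, second_threshold=1)
print(sorted(indicators))  # ['battery', 'motherboard']
```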

In the embodiment of the application, the key information is acquired by performing unsupervised analysis processing on the data. Cluster analysis based on unsupervised learning can discover rules and patterns in the data; compared with supervised learning, unsupervised learning does not require labeled data, so a large amount of manpower and material cost can be saved. Cluster analysis can find clusters hidden in the data and distinguish outlier data; in addition, dimension reduction can be performed on data with high-dimensional features, and the main features of the data can be extracted.
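The application does not name a specific clustering algorithm. As one possible illustration, the sketch below uses a small k-means loop written with the standard library only; the choice of k-means, the deterministic initialization from the first k vectors, and the name `kmeans` are all assumptions.

```python
import math


def kmeans(vectors, k, iterations=20):
    """Cluster vectors into k groups; returns (labels, centroids).

    Centroids are initialized from the first k vectors so that the result is
    deterministic (an assumption made for this sketch).
    """
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


feature_vectors = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
labels, centroids = kmeans(feature_vectors, k=2)
print(labels)  # [0, 0, 1, 1]
```

No labels are supplied to the function, which is the property the text relies on: the grouping is derived from the feature vectors alone.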

Effective information is extracted from the existing data as dimensional features by using strategies such as data processing and analysis, and the extracted, visualized dimensional features serve as integrity evaluation indexes (corresponding to the at least one service index information).

Fig. 2 is a schematic flow chart of another implementation of an information processing method provided in an embodiment of the present application, and as shown in fig. 2, the method is applied to an information processing apparatus, and the method includes:

S201, acquiring a first service file set; each service file in the first service file set is generated under the condition of processing a specific type of service.

S202, analyzing the service information in each service file in the first service file set, and determining at least one service index information corresponding to the first service file set.

S203, determining a target service file set corresponding to each service index information in the at least one service index information; the target set of business files is included in the first set of business files.

S204, determining the ratio of the number of the service files in the target service file set to the number of the service files in the first service file set.

For example, if a certain service index information is the motherboard corresponding to engineer Li, a target service file set related to the motherboard corresponding to engineer Li may be determined.

S205, determining the statistical parameters of the business files in the target business file set.

The statistical parameters may include at least one of: a value obtained by at least one of addition, subtraction, multiplication, division, exponential operation, logarithmic operation, mean, median, mode, minimum, maximum, variance, standard deviation, quantile, and the like.

S206, determining the at least one index range information based on the ratio and/or the statistical parameter.

The embodiment of the present application does not limit the manner of determining the at least one index range information based on the ratio and/or the statistical parameter, and any manner of determining the at least one index range information based on the ratio and/or the statistical parameter should be within the scope of the present application. For example, based on the minimum value and the maximum value, at least one index range information is determined; alternatively, at least one index range information is determined based on the ratio, or at least one index range information is determined based on the ratio and the average, and the like.
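One way of deriving index range information from the ratio and the statistical parameters is sketched below. The sketch assumes that each service file in the target service file set contributes one numeric value (for example, a count of replaced parts) and that the range is taken as the mean plus or minus `k` standard deviations; both the numeric field and the mean ± k·std rule are assumptions, and `index_range` is a hypothetical name.

```python
import statistics


def index_range(target_values, total_file_count, k=2.0):
    """Compute the ratio, basic statistics, and a mean +/- k*std range.

    target_values holds one numeric value per service file in the target
    service file set; total_file_count is the size of the first service file set.
    """
    mean = statistics.mean(target_values)
    std = statistics.pstdev(target_values)  # population standard deviation
    return {
        "ratio": len(target_values) / total_file_count,
        "mean": mean,
        "std": std,
        "min": min(target_values),
        "max": max(target_values),
        "range": (mean - k * std, mean + k * std),
    }


info = index_range([2, 4, 4, 4, 5, 5, 7, 9], total_file_count=40)
print(info["ratio"], info["range"])  # 0.2 (1.0, 9.0)
```

A range based only on the minimum and maximum, as also mentioned in the text, would simply use `(info["min"], info["max"])` instead.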

In the embodiment of the application, once the dimensional features have been obtained as integrity evaluation indexes (corresponding to the at least one service index information), the data can be integrated and aggregated according to the dimensional features. The aggregated original data is computed per index to obtain the actual distribution, proportion, and the like of the original data within each index. Statistics such as the sum, mean, extreme values, and standard deviation are then calculated for each index. Based on how the data of each index is distributed over these statistics, and on specific conditions such as the proportion of the index data, reasonable index evaluation parameters (corresponding to the at least one index range information) are obtained.

Service index data is generated by computing and aggregating over each dimensional feature; statistical measures of each index are obtained by performing statistical calculation on the service index data; and the evaluation parameters of each index are then set, so that the subsequent integrity analysis can be performed according to the index analysis design.

Fig. 3 is a schematic flow chart of an implementation of another information processing method provided in an embodiment of the present application, and as shown in fig. 3, the method is applied to an information processing apparatus, and the method includes:

S301, acquiring a first service file set; each service file in the first service file set is generated under the condition of processing a specific type of service.

S302, analyzing the service information in each service file in the first service file set, and determining at least one service index information corresponding to the first service file set.

S303, determining at least one first range information corresponding to the at least one service index information respectively.

S304, at least one piece of second range information which is preset is obtained.

S305, determining a union of the at least one first range information and the at least one second range information as the at least one index range information.
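Step S305 can be read as a plain set union of derived and pre-specified range information. The sketch below assumes each piece of range information is represented as an (indicator, lower bound, upper bound) record; the `RangeInfo` type, the function name, and the sample indicators are hypothetical.

```python
from typing import NamedTuple


class RangeInfo(NamedTuple):
    """One piece of range information for one service indicator."""
    indicator: str
    low: float
    high: float


def union_of_ranges(first_ranges, second_ranges):
    """Index range information = union of derived and pre-specified range information."""
    return set(first_ranges) | set(second_ranges)


first = [RangeInfo("motherboard/engineer", 1.0, 9.0)]      # derived from the data
second = [RangeInfo("battery/service_station", 0.0, 3.0)]  # specified in advance
index_ranges = union_of_ranges(first, second)
print(len(index_ranges))  # 2
```

Because the result is a set, a pre-specified range that is identical to a derived one is kept only once.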

Fig. 4 is a schematic flow chart of an implementation of another information processing method provided in an embodiment of the present application, and as shown in fig. 4, the method is applied to an information processing apparatus, and the method includes:

S401, acquiring a first service file set; each service file in the first service file set is generated under the condition of processing a specific type of service.

S402, analyzing the service information in each service file in the first service file set, and determining at least one service index information corresponding to the first service file set.

S403, determining at least one first range information corresponding to the at least one service index information respectively.

S404, responding to the modification operation of modifying the target range information in the at least one first range information to obtain the at least one index range information.

In this way, after the at least one piece of first range information is obtained, the at least one piece of first range information can be displayed, so that unreasonable target range information in the at least one piece of first range information is manually determined, the target range information is modified, the at least one piece of index range information is obtained, and the accuracy of the determined at least one piece of index range information is improved.
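The modification operation of S404 can be sketched as replacing one entry in a mapping from indicator to range. The mapping representation, the validation rules, and the name `apply_modification` are assumptions; how the ranges are displayed and how the operator input is received is outside the sketch.

```python
def apply_modification(first_ranges, indicator, new_range):
    """Return the index range information after one manual modification.

    first_ranges maps an indicator name to a (low, high) tuple. The input
    mapping is left unchanged so that the original first range information
    remains available for comparison.
    """
    if indicator not in first_ranges:
        raise KeyError(f"unknown indicator: {indicator}")
    low, high = new_range
    if low > high:
        raise ValueError("lower bound must not exceed upper bound")
    updated = dict(first_ranges)
    updated[indicator] = (low, high)
    return updated


first_ranges = {"motherboard/engineer": (1.0, 9.0), "battery/engineer": (0.0, 50.0)}
index_ranges = apply_modification(first_ranges, "battery/engineer", (0.0, 5.0))
print(index_ranges["battery/engineer"])  # (0.0, 5.0)
```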

In the embodiment of the application, integrity evaluation can be performed on specific data by using the judgment index data (corresponding to the at least one service index information) generated in the preceding steps and the reasonable index evaluation parameters (corresponding to the at least one index range information) obtained by index analysis: data within the range defined by the index evaluation parameters is normal data under the index, and data outside that range is abnormal data under the index.
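The in-range / out-of-range rule can be sketched as follows. The sketch assumes each service file in the second service file set has already been reduced to numeric values keyed by indicator, and that a file is flagged as soon as any one of its values falls outside the corresponding range; the data layout and the name `find_abnormal_files` are hypothetical.

```python
def find_abnormal_files(files, index_ranges):
    """Return the ids of service files with at least one value outside its range.

    files maps a file id to {indicator: value}; index_ranges maps an
    indicator to an inclusive (low, high) range. Indicators that a file
    does not report are skipped.
    """
    abnormal = []
    for file_id, values in files.items():
        for indicator, (low, high) in index_ranges.items():
            value = values.get(indicator)
            if value is not None and not (low <= value <= high):
                abnormal.append(file_id)
                break  # one violation is enough to flag the file
    return abnormal


index_ranges = {"motherboards_replaced": (1.0, 9.0)}
second_set = {
    "order-1": {"motherboards_replaced": 4},
    "order-2": {"motherboards_replaced": 15},
    "order-3": {},
}
print(find_abnormal_files(second_set, index_ranges))  # ['order-2']
```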

In the integrity judgment analysis, conventional business analysis rules can be included to supplement and strengthen the integrity analysis, and applicable business judgment criteria can be introduced through tests such as comparison against actual measurements. After the integrity analysis produces an evaluation result, a business analysis expert can be brought in to evaluate and judge the result, which improves the accuracy of the analysis result.

Fig. 5 is a schematic flow chart of an implementation of an information processing method according to another embodiment of the present application, and as shown in fig. 5, the method is applied to an information processing apparatus, and the method includes:

S501, acquiring a first service file set; each service file in the first service file set is generated under the condition of processing a specific type of service.

S502, analyzing the service information in each service file in the first service file set, and determining at least one service index information corresponding to the first service file set.

S503, determining at least one index range information corresponding to the at least one service index information respectively; the at least one index range information is used for analyzing the service information of each service file in the second service file set to be analyzed, and determining an abnormal service file in the second service file set.

S504, determining a second feature vector of each service file in the second service file set.

In some embodiments, the feature vector corresponding to the service information of each service file may be determined as the second feature vector of each service file. In other embodiments, the service information of each service file may be subjected to keyword extraction to obtain a keyword corresponding to each service file, so that the second feature vector of each service file is obtained based on the keyword corresponding to each service file.

And S505, performing dimension reduction on the second feature vector of each service file to obtain a third feature vector of each service file.

The dimension of the third feature vector may be two-dimensional or three-dimensional so that the third feature vector of each business file can be easily visualized.

S506, clustering the third feature vectors of each service file to obtain a second clustering result.

S507, marking the abnormal service file in the displayed second clustering result.

In the embodiment of the application, the multidimensional or high-dimensional data indexes are reduced, and the data are quantized and mapped in a three-dimensional space or a two-dimensional plane space for representation. And carrying out visual display by using the quantized data of the three-dimensional space or the two-dimensional plane space, thereby obtaining the overall distribution situation of the overall data in the three-dimensional space or the two-dimensional plane space.
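The application does not fix a dimension reduction algorithm. As one illustration, the sketch below projects the second feature vectors onto their leading principal components using power iteration with deflation, written with the standard library only. Principal component analysis, the fixed random seed, the `1e-12` degeneracy tolerance, and all function names are assumptions of this sketch.

```python
import math
import random


def _center(vectors):
    """Subtract the per-dimension mean from every vector."""
    n = len(vectors)
    means = [sum(col) / n for col in zip(*vectors)]
    return [[x - m for x, m in zip(v, means)] for v in vectors]


def _covariance(centered):
    """Population covariance matrix of already-centered vectors."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / n for j in range(d)] for i in range(d)]


def _matvec(matrix, vec):
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]


def _top_eigenvector(matrix, iterations=200):
    """Power iteration; returns a zero vector if no variance is left."""
    rng = random.Random(0)  # fixed seed keeps the sketch deterministic
    vec = [rng.gauss(0.0, 1.0) for _ in matrix]
    for _ in range(iterations):
        nxt = _matvec(matrix, vec)
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm < 1e-12:
            return [0.0] * len(matrix)
        vec = [x / norm for x in nxt]
    return vec


def reduce_dimension(vectors, target_dim=2):
    """Project vectors onto their first target_dim principal components."""
    centered = _center(vectors)
    cov = _covariance(centered)
    d = len(cov)
    components = []
    for _ in range(target_dim):
        comp = _top_eigenvector(cov)
        eigval = sum(c * x for c, x in zip(comp, _matvec(cov, comp)))
        components.append(comp)
        # Deflation: remove the variance explained by the component just found.
        cov = [[cov[i][j] - eigval * comp[i] * comp[j] for j in range(d)] for i in range(d)]
    return [[sum(x * c for x, c in zip(row, comp)) for comp in components] for row in centered]


second_vectors = [[0.0, 0.0, 0.0], [4.0, 0.0, 0.0], [0.0, 2.0, 0.0], [4.0, 2.0, 0.0]]
third_vectors = reduce_dimension(second_vectors, target_dim=2)
print(len(third_vectors), len(third_vectors[0]))  # 4 2
```

The resulting two-dimensional third feature vectors can be passed to any plotting tool for display; because the sample points lie in a plane, their pairwise distances are preserved by the projection.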

In some embodiments, the information processing method may further include: determining a target feature vector corresponding to the abnormal service file from the third feature vector of each service file; acquiring at least one adjacent feature vector of the target feature vector; determining a specific service file corresponding to the at least one adjacent feature vector from the second service file set; and determining the object associated with the specific service file as an object to be concerned.

The object of interest may be an engineer, a service station, a provider, a service area, etc.

The target feature vector may include one or more vectors, the adjacent feature vector may be an adjacent vector of each of the one or more vectors, and the number of adjacent vectors of each vector may be at least one. The number of adjacent vectors of each vector may be predetermined; alternatively, it may be determined according to the feature distances from that vector to the other vectors. For example, the more vectors whose feature distance to that vector is smaller than a set value, the greater the number of adjacent vectors that may be determined, and the fewer such vectors, the smaller the number of adjacent vectors that may be determined.
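Both ways of choosing adjacent vectors, a predetermined count and a distance threshold, can be sketched as follows. Euclidean distance is assumed as the feature distance, and the function names, file ids, and engineer labels are hypothetical.

```python
import math


def adjacent_by_count(target, others, count):
    """Return the ids of the `count` files whose vectors are nearest to target."""
    ranked = sorted(others, key=lambda file_id: math.dist(target, others[file_id]))
    return ranked[:count]


def adjacent_by_distance(target, others, set_value):
    """Return the ids of all files whose feature distance to target is below set_value."""
    return [fid for fid, vec in others.items() if math.dist(target, vec) < set_value]


def objects_of_interest(file_ids, file_to_object):
    """Map the specific service files to their associated objects (e.g. engineers)."""
    return {file_to_object[fid] for fid in file_ids}


# Third feature vector of an abnormal file and of the remaining files.
target_vector = (0.0, 0.0)
other_vectors = {"f2": (0.5, 0.0), "f3": (1.0, 1.0), "f4": (8.0, 8.0)}
owners = {"f2": "engineer_1", "f3": "engineer_2", "f4": "engineer_3"}

print(adjacent_by_count(target_vector, other_vectors, count=1))        # ['f2']
near = adjacent_by_distance(target_vector, other_vectors, set_value=2.0)
print(sorted(objects_of_interest(near, owners)))  # ['engineer_1', 'engineer_2']
```

With the distance-based variant, the number of adjacent vectors grows automatically when more vectors lie within the set value, which corresponds to the second option described in the text.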

In the embodiment of the application, the quantized data in the three-dimensional space or the two-dimensional plane space is used together with the abnormal data in the integrity analysis result, which serves as seed data. Suspected individual data is mined by measuring the neighbor relationship between the quantized individuals in the three-dimensional space or the two-dimensional plane space and the seed data, so that the integrity analysis result is more comprehensive and reliable.

In the embodiment of the application, evaluation indexes are constructed by analyzing the data and extracting features, and the data to be evaluated is analyzed and evaluated based on the attribute indexes of the different features to generate analysis result data. When the dimensionality of the data features is high, dimension reduction is performed to generate a visual graphic result, and the existing labeled data and analysis results are combined with analysis dimensions such as distance in the three-dimensional space to associate suspected cheaters, so that highly suspected cheaters can be further mined. This semi-supervised, machine-learning-based anti-cheating solution has the following effects. The existing data can be effectively used to construct features, so that existing resources are fully and efficiently utilized. Because a semi-supervised learning mode is adopted, the learner does not depend on external interaction and automatically uses unlabeled sample data to improve the final cheating identification performance. The solution can assist business analysis experts in making decisions, or take over the work of manual screening; combining a computer with an algorithm can greatly improve work efficiency and reduce the manpower input, thereby reducing cost and improving efficiency. The solution can also be adapted and migrated to different application scenarios by adjusting the data and the specific algorithm model.

In the embodiment of the application, key semantic information is mined from data such as service work order data and maintenance records by combining technical means such as natural language processing, big data analysis, and generalized text clustering analysis. The main information in the data is extracted as dimensional features for data analysis, which avoids handling massive data through manual screening and effectively avoids heavy dependence on the experience of supervisory personnel.

Because key information is collected, sorted, and extracted from the existing data, the construction and design of the evaluation indexes no longer depend on manual work. Adding technologies such as big data and natural language processing allows the information to be mined more comprehensively, and avoids problems such as missing effective indexes and omissions caused by the limited domain knowledge of manual screening.

The effectiveness and the applicability of the evaluation indexes are related to the quality of data used for generating the indexes and algorithm processing logic, and the effectiveness and the applicability of an actual scene of the evaluation indexes can be improved through good data preprocessing operation and a mature algorithm solution.

The form of combining the data and the algorithm can have stronger scene applicability and transferability, the application scene can be adjusted and transferred according to actual use requirements, the actual transfer adjustment can be realized only by preprocessing the actually applied data and optimizing and adjusting the algorithm, and the actual application conversion and expansion are simpler and more convenient.

Main component key information in the dimension data is extracted through high-dimension data dimension reduction compression processing, the separated key information is utilized to construct distribution presentation of actual data in a three-dimensional space or a two-dimensional plane space, and the overall distribution of the actual data can be perceived more visually through data graphical presentation.

The data after dimension reduction can be associated by spatial proximity in the three-dimensional space or the plane space, and records suspected of integrity violation or cheating are further mined by exploiting the fact that data of the same attribute or class is distributed close together in space, so that bad records are not missed in screening.

In a real application scenario, the visualization can be customized according to actual business requirements: the distribution of the compressed dimensional features in the space is displayed visually, and the overall distribution, aggregation state, and free (scattered) state of the measured data are displayed through a three-dimensional or planar coordinate system, so that normal data and abnormal data can be understood more intuitively.

Based on the foregoing embodiments, the present application provides an information processing apparatus. The units included in the apparatus, and the modules included in those units, can be implemented by a processor in an information processing device; of course, they may also be implemented by specific logic circuits.

Fig. 6 is a schematic diagram of a composition structure of an information processing apparatus according to an embodiment of the present application, and as shown in fig. 6, an information processing apparatus 600 includes:

an obtaining unit 601, configured to obtain a first service file set; each service file in the first service file set is generated under the condition of processing a specific type of service;

an analyzing unit 602, configured to analyze service information in each service file in the first service file set, and determine at least one service index information corresponding to the first service file set;

a determining unit 603, configured to determine at least one indicator range information corresponding to the at least one service indicator information, respectively; the at least one index range information is used for analyzing the service information of each service file in the second service file set to be analyzed, and determining an abnormal service file in the second service file set.

In some embodiments, the first set of business files comprises a first set of partial business files; the first part of service file set comprises a plurality of sub service file sets which are respectively obtained from a plurality of data sources; an analyzing unit 602, configured to determine a plurality of service index information corresponding to each of the sub-service file sets; and determining the at least one service index information based on the plurality of service index information and the attribute information of the data source corresponding to each sub service file set.

In some embodiments, the first set of business files comprises a second set of partial business files; the analysis unit 602 is further configured to extract a keyword from the service information of each service file in the second part of service file set, so as to obtain a keyword set; and determining the at least one service index information based on the keywords of which the occurrence times are greater than a first threshold value in the keyword set.

In some embodiments, the first set of business files comprises a third set of partial business files; the analyzing unit 602 is further configured to extract a first feature vector from the service information of each service file in the third part of service file set, so as to obtain a feature vector set; clustering the feature vector set to obtain a first clustering result; and determining the at least one service index information based on the keywords corresponding to the service files in each category in the first clustering result.

In some embodiments, the determining unit 603 is further configured to determine a target service file set corresponding to each service index information in the at least one service index information; the target business file set is included in the first business file set; determining the ratio of the number of the service files in the target service file set to the number of the service files in the first service file set; determining statistical parameters of the business files in the target business file set; determining the at least one indicator range information based on the ratio and/or the statistical parameter.

In some embodiments, the determining unit 603 is further configured to determine at least one first range information corresponding to the at least one service indicator information respectively; acquiring at least one second range information which is specified in advance; determining a union of the at least one first range information and the at least one second range information as the at least one index range information.

In some embodiments, the determining unit 603 is further configured to determine at least one first range information corresponding to the at least one service indicator information respectively; and responding to a modification operation for modifying the target range information in the at least one first range information to obtain the at least one index range information.

In some embodiments, the determining unit 603 is further configured to determine a second feature vector of each service file in the second service file set; performing dimension reduction on the second feature vector of each service file to obtain a third feature vector of each service file; clustering the third feature vectors of each service file to obtain a second clustering result; and marking the abnormal service file in the displayed second clustering result.

In some embodiments, the determining unit 603 is further configured to determine, from the third feature vector of each service file, a target feature vector corresponding to the abnormal service file; acquiring at least one adjacent feature vector of the target feature vector; determining a specific service file corresponding to the at least one adjacent feature vector from the second service file set; and determining the object associated with the specific service file as an object to be concerned.

The above description of the apparatus embodiments, similar to the above description of the method embodiments, has similar beneficial effects as the method embodiments. For technical details not disclosed in the embodiments of the apparatus of the present application, reference is made to the description of the embodiments of the method of the present application for understanding.

It should be noted that, in the embodiment of the present application, if the information processing method is implemented in the form of a software functional module and sold or used as a standalone product, it may also be stored in a computer storage medium. Based on such understanding, the technical solutions of the embodiments of the present application may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing an information processing apparatus to execute all or part of the methods described in the embodiments of the present application.

Fig. 7 is a schematic diagram of a hardware entity of an information processing apparatus according to an embodiment of the present application, and as shown in fig. 7, the hardware entity of the information processing apparatus 700 includes: a processor 701 and a memory 702, wherein the memory 702 stores a computer program operable on the processor 701, and the processor 701 implements the steps of the method of any of the above embodiments when executing the program.

The memory 702 is configured to store instructions and applications executable by the processor 701, including the computer program, and may also buffer data (e.g., image data, audio data, voice communication data, and video communication data) to be processed or already processed by the processor 701 and each module in the information processing apparatus 700; the memory 702 may be implemented by a flash memory (FLASH) or a Random Access Memory (RAM).

The steps of the information processing method of any one of the above are implemented when the processor 701 executes a program. The processor 701 generally controls the overall operation of the information processing apparatus 700.

The present embodiment provides a computer storage medium, which stores one or more programs, where the one or more programs are executable by one or more processors to implement the steps of the information processing method according to any one of the above embodiments.

Here, it should be noted that: the above description of the storage medium and device embodiments is similar to the description of the method embodiments above, with similar advantageous effects as the method embodiments. For technical details not disclosed in the embodiments of the storage medium and apparatus of the present application, reference is made to the description of the embodiments of the method of the present application for understanding.

The information processing apparatus, chip or processor described above may include an integration of any one or more of: an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Digital Signal Processing Device (DSPD), a Programmable Logic Device (PLD), a Field Programmable Gate Array (FPGA), a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), an embedded Neural-network Processing Unit (NPU), a controller, a microcontroller, a microprocessor, a discrete gate or transistor logic device, and discrete hardware components. It is understood that the electronic device implementing the above-mentioned processor function may be other electronic devices, and the embodiments of the present application are not particularly limited.

The computer storage medium/memory may be a Read Only Memory (ROM), a Programmable Read Only Memory (PROM), an Erasable Programmable Read Only Memory (EPROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Ferroelectric Random Access Memory (FRAM), a Flash Memory, a magnetic surface memory, an optical disc, or a Compact Disc Read-Only Memory (CD-ROM), and the like; it may also be any of various terminals, such as mobile phones, computers, tablet devices, and personal digital assistants, that include one or any combination of the above-mentioned memories.

It should be appreciated that reference throughout this specification to "one embodiment" or "an embodiment of the present application" or "a previous embodiment" or "some implementations" or "some embodiments" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present application. Thus, the appearances of the phrases "in one embodiment" or "in an embodiment" or "an embodiment of the present application" or "the preceding embodiments" or "some embodiments" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. It should be understood that, in the various embodiments of the present application, the sequence numbers of the above-mentioned processes do not mean the execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application. The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.

In a case where no specific description is given, the information processing apparatus may execute any step in the embodiment of the present application, and the processor of the information processing apparatus may execute the step. Unless otherwise specified, the embodiment of the present application does not limit the order in which the information processing apparatus performs the following steps. In addition, the data may be processed in the same way or in different ways in different embodiments. It should be further noted that any step in the embodiments of the present application may be executed by the information processing apparatus independently, that is, when the information processing apparatus executes any step in the above embodiments, the execution of other steps may not be relied on.

Furthermore, the terms "first", "second" and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, features defined as "first", "second" or "third" may explicitly or implicitly include one or more of the described features. In the description of the present application, "a plurality" means two or more unless specifically limited otherwise.

In the description of the present application, it is to be noted that, unless otherwise explicitly specified or limited, the terms "mounted," "coupled," and "connected" are to be construed broadly: a connection may be a fixed connection, a removable connection, or an integral connection; it may be a mechanical connection, an electrical connection, or a communication connection; and it may be direct, indirect through intervening media, internal to two elements, or any other relationship between them. The specific meaning of the above terms in the present application can be understood by those of ordinary skill in the art as appropriate.

In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described device embodiments are merely illustrative, for example, the division of the unit is only a logical functional division, and there may be other division ways in actual implementation, such as: multiple units or components may be combined, or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or units may be electrical, mechanical or other forms.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; they may be located in one place or distributed across a plurality of network units; and some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, all functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may be separately regarded as one unit, or two or more units may be integrated into one unit; the integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.

The methods disclosed in the several method embodiments provided in the present application may be combined arbitrarily without conflict to obtain new method embodiments.

Features disclosed in several of the product embodiments provided in the present application may be combined in any combination to yield new product embodiments without conflict.

The features disclosed in the several method or apparatus embodiments provided in the present application may be combined arbitrarily, without conflict, to arrive at new method embodiments or apparatus embodiments.

Those of ordinary skill in the art will understand that all or part of the steps of the method embodiments can be carried out by hardware under the control of program instructions. The program can be stored in a computer storage medium and, when executed, performs the steps of the method embodiments. The aforementioned storage medium includes various media that can store program code, such as a removable memory device, a Read Only Memory (ROM), a magnetic disk, or an optical disk.

Alternatively, the integrated units described above in this application may be stored in a computer storage medium if they are implemented in the form of software functional modules and sold or used as separate products. Based on such understanding, the technical solutions of the embodiments of the present application, in essence or in the part contributing over the related art, may be embodied in the form of a software product that is stored in a storage medium and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the methods described in the embodiments of the present application. The aforementioned storage medium includes a removable storage device, a ROM, a magnetic or optical disk, or other various media that can store program code.

In the embodiments of the present application, the descriptions of the same steps and the same contents in different embodiments may be mutually referred to. In the embodiments of the present application, the term "and" does not affect the order of the steps. For example, where the information processing apparatus executes A and executes B, the information processing apparatus may execute A first and then execute B, may execute B first and then execute A, or may execute A while executing B.

It is to be noted that the drawings in the embodiments of the present application are only for illustrating schematic positions of respective devices on an information processing apparatus and do not represent actual positions in the information processing apparatus, the actual positions of the respective devices or the respective areas may be changed or shifted accordingly depending on actual conditions (for example, the structure of the information processing apparatus), and the scale of different parts in the information processing apparatus in the drawings does not represent the actual scale.

As used in the examples of this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.

It should be understood that the term "and/or" as used herein merely describes an association between associated objects and means that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates that the former and latter associated objects are in an "or" relationship.
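As a minimal illustration only, the three relationships covered by "A and/or B" can be enumerated as the cases of an inclusive "or". The function name `and_or` and the variable names are hypothetical and are not part of the application.

```python
from itertools import product


def and_or(a_exists: bool, b_exists: bool) -> bool:
    """Inclusive 'or': true when A exists, B exists, or both exist."""
    return a_exists or b_exists


# Enumerate every combination of (A exists, B exists) and keep those
# that satisfy "A and/or B". Only the case where neither exists is excluded.
satisfying_cases = [
    (a, b) for a, b in product([True, False], repeat=2) if and_or(a, b)
]

for a, b in satisfying_cases:
    print(f"A exists: {a}, B exists: {b}")
```

The sketch keeps exactly the three combinations named in the text: A alone, A and B together, and B alone.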

It should be noted that, in the embodiments related to the present application, all the steps may be executed or some of the steps may be executed, as long as a complete technical solution can be formed.

The above description covers only embodiments of the present application, but the scope of protection of the present application is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive within the technical scope disclosed in the present application shall be covered by the scope of protection of the present application. Therefore, the scope of protection of the present application shall be subject to the scope of protection of the claims.

Claims

1. An information processing method comprising:

2. The method of claim 1, wherein the first set of service files comprises a first set of partial service files; the first part of service file set comprises a plurality of sub service file sets which are respectively obtained from a plurality of data sources;

3. The method of claim 1, wherein the first set of service files comprises a second set of partial service files;

4. The method of claim 1, wherein the first set of service files comprises a third set of partial service files;

clustering the characteristic vector set to obtain a first clustering result;

5. The method according to any one of claims 1 to 4, wherein the determining at least one index range information corresponding to the at least one service index information respectively comprises:

6. The method according to any one of claims 1 to 4, wherein the determining at least one index range information corresponding to the at least one service index information respectively comprises:

acquiring at least one second range information which is specified in advance;

7. The method according to any one of claims 1 to 4, wherein the determining at least one index range information corresponding to the at least one service index information respectively comprises:

8. The method according to any one of claims 1 to 4, wherein after determining at least one index range information corresponding to the at least one service index information, the method further comprises:

9. The method of claim 8, wherein the method further comprises:

10. An information processing apparatus comprising a memory and a processor, wherein

the memory stores a computer program operable on the processor,

the processor, when executing the computer program, implements the steps of the method of any one of claims 1 to 9.