CN114358395A - Attendance checking prediction method and device - Google Patents

Attendance checking prediction method and device

Info

Publication number
CN114358395A
Authority
CN
China
Prior art keywords
data
attendance
prediction
sample data
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111550306.0A
Other languages
Chinese (zh)
Inventor
姚笛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Anjaxin Human Resources Co ltd
Original Assignee
Shanghai Anjaxin Human Resources Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Anjaxin Human Resources Co ltd
Priority to CN202111550306.0A
Publication of CN114358395A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G06Q10/109 Time management, e.g. calendars, reminders, meetings or time accounting
    • G06Q10/1091 Recording time for administrative or management purposes

Landscapes

  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Physics & Mathematics (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Educational Administration (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The embodiments of the present disclosure disclose an attendance checking prediction method and device. The method comprises: acquiring attendance data of a large number of users from an attendance system as sample data, and classifying the sample data based on preconfigured time intervals and the card punching time in the sample data; extracting, for the sample data in each time interval, data of preset dimensions as feature data; using the feature data as the input of a pre-established model, which outputs the prediction scores of the sample data in the different time intervals; and determining, based on the prediction scores and the preconfigured prediction thresholds corresponding to the different time intervals, the attendance prediction result corresponding to each prediction score, wherein the prediction result indicates whether the attendance is false. By extracting feature data from the attendance data and training and adjusting the attendance model, false attendance data can be predicted, which solves the technical problem in the related art that false attendance data cannot be predicted.

Description

Attendance checking prediction method and device
Technical Field
The disclosure relates to the technical field of data processing, in particular to an attendance checking prediction method, an attendance checking prediction device and electronic equipment.
Background
Attendance checking is often subject to cheating: for example, an attendance record is made, but the employee is not actually present. As another example, a labor enterprise dispatches an employee to a work organization, and the employee checks in using the labor enterprise's attendance system but does not actually go to the work organization.
In the related art, false attendance behavior cannot be predicted.
Disclosure of Invention
The main purpose of the present disclosure is to provide an attendance checking prediction method and device.
In order to achieve the above object, according to a first aspect of the present disclosure, there is provided an attendance prediction method, including: acquiring attendance data of a large number of users from an attendance system as sample data, and classifying the sample data based on preconfigured time intervals and the card punching time in the sample data; extracting, for the sample data in each time interval, data of preset dimensions as feature data; using the feature data as the input of a pre-established model, which outputs the prediction scores of the sample data in the different time intervals; and determining, based on the prediction scores and the preconfigured prediction thresholds corresponding to the different time intervals, the attendance prediction result corresponding to each prediction score, wherein the prediction result indicates whether the attendance is false.
Optionally, the method further comprises: judging whether the prediction result is correct based on feedback data for the sample data; if the prediction is incorrect, adjusting the output prediction result; and determining a prediction threshold of the model based on all adjusted prediction results, wherein when the prediction score is greater than the prediction threshold, the prediction result indicates that the attendance is false; otherwise, no false attendance exists.
Optionally, the method further comprises: for any time interval, determining the dimensions of the data to be extracted in that interval as the basis for extracting the feature data.
Optionally, using the feature data as the input of the pre-established model comprises: using the feature data as the input of a pre-established XGBoost (XGB) model, a LightGBM model and a random forest model, respectively; combining the results output by the different models; and determining the final predicted value of each sample by weighted voting, wherein the predicted value represents the probability of false attendance behavior.
Optionally, extracting, for the sample data in each time interval, data of preset dimensions as feature data includes: extracting, for the sample data in each time interval, the motion data at the time of card punching; and extracting, for the sample data in each time interval, the card punching location data.
Optionally, the method further comprises: extracting, for the sample data in each time interval, the power consumption data of the device in the sample data.
Optionally, the method further comprises: if the prediction threshold is adjusted, changing the attendance prediction result based on the adjusted prediction threshold.
According to a second aspect of the present disclosure, there is provided an attendance prediction method, including: in response to receiving a card punching request from an employee, acquiring the employee's card punching data in real time; inputting the card punching data into a model, and outputting an attendance prediction result; and sending the attendance prediction result to a preset user side for verification.
According to a third aspect of the present disclosure, there is also provided an apparatus for implementing the attendance checking prediction method, the apparatus including: a preprocessing unit configured to classify sample data based on preconfigured time intervals and the card punching time in the sample data, after attendance data of a large number of users are acquired from an attendance system as the sample data; a feature extraction unit configured to extract, for the sample data in each time interval, data of preset dimensions as feature data; and a model training unit configured to use the feature data as the input of a pre-established model and output the prediction scores of the sample data in the different time intervals; the model training unit being further configured to determine, based on the prediction scores and the preconfigured prediction thresholds corresponding to the different time intervals, the attendance prediction result corresponding to each prediction score, wherein the prediction result indicates whether the attendance is false.
According to a fourth aspect of the present disclosure, there is provided a computer-readable storage medium storing computer instructions for causing a computer to execute the attendance prediction method according to any one of the implementation manners of the first aspect.
The attendance prediction method and device of the embodiments of the present disclosure comprise: acquiring attendance data of a large number of users from an attendance system as sample data, and classifying the sample data based on preconfigured time intervals and the card punching time in the sample data; extracting, for the sample data in each time interval, data of preset dimensions as feature data; using the feature data as the input of a pre-established model, which outputs the prediction scores of the sample data in the different time intervals; and determining, based on the prediction scores and the preconfigured prediction thresholds corresponding to the different time intervals, the attendance prediction result corresponding to each prediction score, wherein the prediction result indicates whether the attendance is false. By extracting feature data from the attendance data and training and adjusting the attendance model, false attendance data can be predicted, which solves the technical problem in the related art that false attendance data cannot be predicted.
Drawings
In order to more clearly illustrate the embodiments of the present disclosure or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present disclosure, and other drawings can be obtained by those skilled in the art without creative efforts.
Fig. 1 is a flow diagram of an attendance prediction method according to an embodiment of the present disclosure;
Fig. 2 is a flow diagram of an attendance prediction method according to another embodiment of the present disclosure;
Fig. 3 is a schematic diagram of an electronic device according to an embodiment of the disclosure.
Detailed Description
In order to make the technical solutions of the present disclosure better understood by those skilled in the art, the technical solutions of the embodiments of the present disclosure will be clearly and completely described below with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are only some embodiments of the present disclosure, not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It should be understood that the data so used may be interchanged under appropriate circumstances such that embodiments of the present disclosure may be described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be noted that, in the present disclosure, the embodiments and features of the embodiments may be combined with each other without conflict. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
According to an embodiment of the present disclosure, an attendance prediction method is provided, as shown in fig. 1, the method includes the following steps 101 to 104:
Step 101: After attendance data of a large number of users are acquired from the attendance system as sample data, the sample data are classified based on preconfigured time intervals and the card punching time in the sample data.
In this embodiment, historical attendance data of a large number of employees can be collected by a data collector. Labeling the historical attendance data yields normal attendance data and fraudulent attendance data; the fraud label can be applied by marking the abnormal users and dates in the data, and the labeled historical attendance data can be used as sample data.
The preconfigured time intervals may be set based on the users' active times in the data. Attendance data differ between time intervals, so the feature data extracted for prediction also differ. For example, user attendance is generally divided into two shifts, a day shift and a night shift; in the corresponding time intervals, the day-shift attendance data carry richer information, while the night-shift attendance data carry sparser information. By dividing the data into different time intervals for training, this embodiment can perform attendance prediction in a targeted manner, making the prediction result more accurate.
The sample data include the card punching time point, and each sample is assigned to a configured time interval based on that time point, yielding the sample data for the different time intervals.
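As a minimal sketch of the classification in step 101, the following Python groups samples by the interval that contains their card punching time. The interval names and boundaries (`day_shift`, `night_shift`, 06:00 and 18:00) and the `punch_at` field are illustrative assumptions, not values given in the disclosure.

```python
from datetime import datetime, time

# Hypothetical preconfigured intervals: name -> (start, end).
# An interval whose start is later than its end wraps past midnight.
INTERVALS = {
    "day_shift": (time(6, 0), time(18, 0)),
    "night_shift": (time(18, 0), time(6, 0)),
}


def interval_for(punch_time: time) -> str:
    """Return the name of the configured interval containing punch_time."""
    for name, (start, end) in INTERVALS.items():
        if start <= end:
            inside = start <= punch_time < end
        else:  # interval wraps around midnight
            inside = punch_time >= start or punch_time < end
        if inside:
            return name
    raise ValueError(f"no interval covers {punch_time}")


def classify(samples: list[dict]) -> dict[str, list[dict]]:
    """Group samples by the interval of their 'punch_at' timestamp."""
    groups: dict[str, list[dict]] = {name: [] for name in INTERVALS}
    for sample in samples:
        groups[interval_for(sample["punch_at"].time())].append(sample)
    return groups


samples = [
    {"user": "u1", "punch_at": datetime(2021, 12, 1, 8, 55)},
    {"user": "u2", "punch_at": datetime(2021, 12, 1, 21, 3)},
    {"user": "u3", "punch_at": datetime(2021, 12, 2, 2, 30)},
]
groups = classify(samples)
```

Here `u1` lands in the day-shift group, while `u2` and `u3` land in the night-shift group because that interval wraps past midnight.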
Step 102: For the sample data in each time interval, data of preset dimensions are extracted as feature data.
In this embodiment, data are extracted from each sample in each interval according to the preset dimensions.
As an optional implementation manner of this embodiment, the method further includes: for any time interval, determining the dimensions of the data to be extracted in that interval as the basis for extracting the feature data.
In this optional implementation manner, screening for the best feature data yields a user profile of the user's attendance on the day the data correspond to. Different time intervals require different features, and the features for each time interval can be determined from the hit rate and the coverage rate. For any time interval, a plurality of features of different dimensions can be preset. For each preset feature, the number of first samples, i.e. abnormal samples in the time interval that exhibit the feature, is determined; the ratio of the number of first samples to the number of abnormal samples in the time interval is the coverage rate. Likewise, for each preset feature, the number of second samples, i.e. samples in the time interval that exhibit the feature, is determined; the ratio of the number of second samples to the number of all samples in the interval is the hit rate. For example, suppose the time interval from 3 months to 7 months contains 15000 samples, of which 3121 are abnormal, and the preset features are A1, A2, ..., An. For any feature Am, if 600 of the 3121 abnormal samples have the feature, the coverage rate is 600/3121; if 3000 of the 15000 samples have the feature, the hit rate is 3000/15000.
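The coverage rate and hit rate defined above can be sketched in a few lines of Python. The sample layout (dictionaries with an `abnormal` flag and a boolean `am` field standing for feature Am) is an assumption made for illustration; the counts reproduce the worked example of 15000 samples, 3121 abnormal samples, 600 abnormal samples with the feature, and 3000 samples with the feature overall.

```python
def coverage_and_hit_rate(samples, has_feature):
    """Return (coverage, hit_rate) for one preset feature in one interval.

    coverage = abnormal samples with the feature / all abnormal samples
    hit_rate = samples with the feature / all samples
    """
    abnormal = [s for s in samples if s["abnormal"]]
    if not samples or not abnormal:
        raise ValueError("need at least one sample and one abnormal sample")
    first = sum(1 for s in abnormal if has_feature(s))   # "first samples"
    second = sum(1 for s in samples if has_feature(s))   # "second samples"
    return first / len(abnormal), second / len(samples)


# Synthetic data matching the counts in the example (layout is hypothetical).
samples = (
    [{"abnormal": True, "am": True}] * 600
    + [{"abnormal": True, "am": False}] * 2521
    + [{"abnormal": False, "am": True}] * 2400
    + [{"abnormal": False, "am": False}] * 9479
)
coverage, hit_rate = coverage_and_hit_rate(samples, lambda s: s["am"])
# coverage equals 600/3121 and hit_rate equals 3000/15000
```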
A plurality of features may be preset, for example: the user's working hours; the user's maximum number of consecutive absent days in the last 7 days; the difference between the user's working hours on the card punching day and those of coworkers in the same factory; the differences in longitude and latitude between the user's card punching position and the positions of the previous card punch and of the office card punch; the difference between the user's shift date and that of coworkers in the same factory; the difference between the user's card punching time on the day and the card punching times over the last 7 days; the user's number of absent days; and the number of times the user enters and leaves the factory during attendance. These features are only illustrative; the feature data finally required by the model can be determined from among them.
After the hit rate and the coverage rate are determined, it can be judged whether the hit rate falls within a preset first threshold interval and whether the coverage rate falls within a preset second threshold interval. For any feature, if both the hit rate and the coverage rate fall within their preset intervals, the feature can be used as a feature of the model.
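The screening rule can be illustrated with a short Python sketch. The feature names, the rate values and the two threshold intervals below are hypothetical; the disclosure does not give concrete bounds.

```python
def select_features(stats, hit_interval, coverage_interval):
    """Keep features whose hit rate and coverage both fall in their intervals.

    stats maps a feature name to a (coverage, hit_rate) pair.
    """
    hit_lo, hit_hi = hit_interval
    cov_lo, cov_hi = coverage_interval
    return [
        name
        for name, (coverage, hit_rate) in stats.items()
        if hit_lo <= hit_rate <= hit_hi and cov_lo <= coverage <= cov_hi
    ]


# Hypothetical per-feature statistics for one time interval.
stats = {
    "A1": (0.19, 0.20),  # both rates inside their intervals
    "A2": (0.02, 0.01),  # too rare to be useful
    "A3": (0.60, 0.95),  # present in almost every sample
}
selected = select_features(
    stats,
    hit_interval=(0.05, 0.50),       # first threshold interval (assumed)
    coverage_interval=(0.10, 1.00),  # second threshold interval (assumed)
)
```

With these assumed bounds only `A1` is kept: `A2` is excluded on both rates, and `A3` is excluded because its hit rate lies above the first interval.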
The data carrying the abnormal label in the time interval are acquired, and the intersection of the features in the abnormally labeled data with the preset features is determined; the ratio of the size of this intersection to the number of preset features is the coverage rate. In the sample data of any time interval, a piece of data that has the same feature as any preset feature is hit data, and the ratio of all hit data to the sample data is the hit rate. The features applicable to the model can be determined from the coverage rate and the hit rate.
After the features are determined, the raw data for those features can be extracted and processed with feature engineering techniques such as PCA (principal component analysis) dimensionality reduction and missing value ratio filtering, and then input into the model for training.
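Of the feature engineering techniques mentioned, the missing value ratio filter is simple enough to sketch directly; PCA is not shown here. The column names and the 0.5 cutoff are assumptions for illustration only.

```python
def drop_sparse_features(rows, max_missing_ratio):
    """Drop features whose share of missing (None) values exceeds the cutoff.

    Returns the kept feature names and the rows restricted to those features.
    """
    if not rows:
        return [], []
    kept = []
    for name in rows[0]:
        missing = sum(1 for row in rows if row.get(name) is None)
        if missing / len(rows) <= max_missing_ratio:
            kept.append(name)
    return kept, [{name: row.get(name) for name in kept} for row in rows]


# Hypothetical raw feature rows; None marks a missing value.
rows = [
    {"work_hours": 8.0, "power_mw": None, "gate_passes": 2},
    {"work_hours": 7.5, "power_mw": None, "gate_passes": 3},
    {"work_hours": None, "power_mw": 410.0, "gate_passes": 2},
    {"work_hours": 8.2, "power_mw": None, "gate_passes": 4},
]
kept, filtered = drop_sparse_features(rows, max_missing_ratio=0.5)
```

In this example `power_mw` is missing in three of four rows and is dropped, while `work_hours` (one missing value) and `gate_passes` are kept.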
As an optional implementation manner of this embodiment, extracting, for the sample data in each time interval, data of preset dimensions as feature data includes: extracting, for the sample data in each time interval, the motion data at the time of card punching, wherein the motion data comprise the user's moving speed and moving direction before and after the card punching time within the attendance area; and extracting, for the sample data in each time interval, the card punching location data.
In this optional implementation manner, the user's motion data can be acquired when the employee punches the card. The motion data may include the user's moving speed before and after the card punching time within the attendance area, which can serve as a feature for identifying abnormal data, and the moving direction. Whether the moving direction runs against or agrees with the direction of the card punching position, both the moving speed and the moving direction can serve as factors for identifying abnormal data.
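The disclosure does not say how the moving speed and direction are obtained. One possible sketch, assuming two timestamped latitude/longitude fixes taken around the card punching time, uses the standard haversine distance and initial bearing formulas; all function and field names here are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2, in degrees from north."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlmb)
    return math.degrees(math.atan2(y, x)) % 360.0


def motion_features(before, after, punch_site):
    """before/after are (lat, lon, unix_seconds); punch_site is (lat, lon)."""
    elapsed = after[2] - before[2]
    distance = haversine_m(before[0], before[1], after[0], after[1])
    speed = distance / elapsed if elapsed > 0 else 0.0
    heading = bearing_deg(before[0], before[1], after[0], after[1])
    to_site = bearing_deg(before[0], before[1], punch_site[0], punch_site[1])
    diff = abs(heading - to_site) % 360.0
    # 0 means moving straight toward the punch site, 180 straight away from it.
    return {"speed_mps": speed, "heading_deviation_deg": min(diff, 360.0 - diff)}


# A user walking north for 100 s, with the punch site ahead and then behind.
toward = motion_features((31.000, 121.0, 0), (31.001, 121.0, 100), (31.002, 121.0))
away = motion_features((31.000, 121.0, 0), (31.001, 121.0, 100), (30.999, 121.0))
```

The resulting speed and heading deviation could then be supplied to the model alongside the other features.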
As an optional implementation manner of this embodiment, for the sample data in each time interval, the power consumption data of the device in the sample data are extracted.
In this optional implementation manner, the power consumption data of the card punching device can be continuously monitored when the employee punches the card; the power consumption can serve as a factor for judging whether the employee is working normally.
It will be appreciated that the above features are merely exemplary; the feature data may include one or more of the preset features mentioned above, as well as features not mentioned.
Step 103: The feature data are used as the input of a pre-established model, which outputs the prediction scores of the sample data in different time intervals.
In this embodiment, the pre-established model can make predictions on attendance data. The extracted feature data are used as the input of the model; the model processes the input data and outputs a prediction score.
As an optional implementation manner of this embodiment, using the feature data as the input of the pre-established model includes: using the feature data as the input of a pre-established XGBoost (XGB) model, a LightGBM model and a random forest model, respectively; combining the results output by the different models; and determining the final predicted value of each sample by weighted voting, wherein the predicted value represents the probability of false attendance behavior.
In this optional implementation manner, the models are tuned by means such as Bayesian hyperparameter selection and grid search hyperparameter selection. During algorithm optimization, the maximum number of classification trees and the minimum number of tree layers are controlled together with L1 and L2 regularization, in order to improve the model's generalization on imbalanced data, and the tuned model is validated against the initial model. The final model predictions are combined by weighted voting to give the overall probability that the user commits fraud, where each model's weight is calculated from its F1 score on each test set. When the overall probability reaches the set prediction threshold, the data are determined to be abnormal.
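How the F1 scores are turned into voting weights is not specified. A minimal sketch, assuming each model's weight is simply its F1 score normalized so that the weights sum to 1, might look like this; the labels, predictions and model names are made up for illustration.

```python
def f1_score(y_true, y_pred):
    """F1 for the positive (false attendance) class: 2TP / (2TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def weights_from_f1(f1_by_model):
    """Normalize F1 scores into voting weights that sum to 1 (assumed scheme)."""
    total = sum(f1_by_model.values())
    if total == 0:
        raise ValueError("all F1 scores are zero")
    return {name: f1 / total for name, f1 in f1_by_model.items()}


# Hypothetical test-set labels and per-model predictions (1 = false attendance).
y_true = [1, 1, 0, 0, 1]
predictions = {
    "model_a": [1, 1, 0, 0, 0],  # F1 = 4/5
    "model_b": [1, 0, 1, 0, 1],  # F1 = 4/6
}
f1_by_model = {name: f1_score(y_true, pred) for name, pred in predictions.items()}
weights = weights_from_f1(f1_by_model)
```

A model with a higher F1 score on the test set therefore contributes more to the combined probability.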
When selecting the model, the best model can be chosen by comparing multiple algorithms. The selection criterion is the ROC (receiver operating characteristic) curve, which evaluates the overall accuracy and discriminative power of the model's predictions on the training set and the test set.
For example, the probability values of abnormality that the three models give for a user may be weighted in the ratio 3.4 : 3 : 3.6 to obtain the final probability value; the weights are derived from the ROC values of the three models. The model can then issue an early warning for users whose probability of abnormality is about 70 percent.
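The weighted combination in this example can be worked through in Python. The text does not say which weight belongs to which model, so the assignment below is an assumption, as are the three per-model probabilities and the reading of "about 70 percent" as a cutoff of 0.70.

```python
# Weight ratio from the example; which model gets which weight is assumed.
WEIGHTS = {"xgboost": 3.4, "lightgbm": 3.0, "random_forest": 3.6}
WARNING_LEVEL = 0.70  # assumed reading of "about 70 percent"


def combined_probability(probabilities, weights=WEIGHTS):
    """Weighted average of the per-model abnormality probabilities."""
    total = sum(weights.values())
    return sum(weights[name] * probabilities[name] for name in weights) / total


# Hypothetical per-model probabilities for one user.
probabilities = {"xgboost": 0.80, "lightgbm": 0.60, "random_forest": 0.70}
final_probability = combined_probability(probabilities)
# (3.4 * 0.80 + 3.0 * 0.60 + 3.6 * 0.70) / 10.0, i.e. about 0.704
issue_warning = final_probability >= WARNING_LEVEL
```

Dividing by the sum of the weights keeps the result a probability even though the ratio itself sums to 10 rather than 1.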
Step 104: Based on the prediction scores and the preconfigured prediction thresholds corresponding to the different time intervals, the attendance prediction result corresponding to each prediction score is determined, wherein the prediction result indicates whether the attendance is false.
In this embodiment, the result corresponding to the output score is determined based on a preset threshold: if the score is greater than the preset threshold, false attendance is determined to exist; otherwise, it is not. The output result comprises the time interval information, the prediction score, and whether the score indicates false attendance.
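Step 104 amounts to a per-interval comparison, sketched below. The interval names and threshold values are hypothetical; only the "greater than the threshold" rule and the three output fields come from the text.

```python
# Hypothetical preconfigured thresholds, one per time interval.
THRESHOLDS = {"day_shift": 0.70, "night_shift": 0.60}


def attendance_result(interval, score, thresholds=THRESHOLDS):
    """Build the output record: interval, score, and the false-attendance flag."""
    return {
        "interval": interval,
        "score": score,
        "is_false": score > thresholds[interval],  # strictly greater, as in the text
    }


day = attendance_result("day_shift", 0.65)
night = attendance_result("night_shift", 0.65)
```

The same score of 0.65 is treated as normal in the day-shift interval but flagged as false attendance in the night-shift interval, which is the point of keeping a separate threshold per interval.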
As an optional implementation manner of this embodiment, whether the prediction result is correct is judged based on feedback data for the sample data; if the prediction is incorrect, the output prediction result is adjusted; and a prediction threshold of the model is determined based on all adjusted prediction results, wherein when the prediction score is greater than the prediction threshold, the prediction result indicates that the attendance is false; otherwise, no false attendance exists.
In this optional implementation manner, the sample data may include data that have been verified as false attendance, and the model can be optimized and corrected on this basis. The output results can be compared against the verified false attendance to judge whether a prediction is wrong; if so, the prediction result can be changed, and the threshold value can then be readjusted based on all the changed prediction results, making the prediction more accurate.
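The disclosure does not state how the threshold is recomputed from the corrected results. One simple possibility, shown here purely as an assumption, is to pick the candidate threshold that agrees with the largest number of verified labels.

```python
def retune_threshold(scores, verified_false):
    """Pick the threshold that best reproduces the verified labels.

    A sample is predicted false when score > threshold. Candidates are 0.0
    and every distinct score; ties are resolved toward the lowest candidate.
    This selection rule is an illustrative assumption.
    """
    candidates = [0.0] + sorted(set(scores))
    best_threshold, best_correct = candidates[0], -1
    for threshold in candidates:
        correct = sum(
            (score > threshold) == label
            for score, label in zip(scores, verified_false)
        )
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


# Hypothetical scores with feedback; True means verified false attendance.
scores = [0.20, 0.40, 0.55, 0.65, 0.90]
verified_false = [False, False, True, True, True]

old_threshold = 0.60  # misses the sample scored 0.55
new_threshold = retune_threshold(scores, verified_false)
```

With this feedback the threshold moves from 0.60 down to 0.40, after which every prediction matches its verified label.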
As an optional implementation manner of this embodiment, if the prediction threshold is adjusted, the attendance prediction result is changed based on the prediction threshold.
In this alternative implementation, the model may automatically adjust the prediction result after the prediction threshold is adjusted.
In this way, false attendance behavior is predicted and the prediction accuracy is improved.
According to an embodiment of the present disclosure, there is also provided an attendance prediction method, as shown in Fig. 2, the method includes:
Step 201: In response to receiving an employee's card punching request, the employee's card punching data are acquired in real time.
In this embodiment, if a card punching request of an employee is received, the card punching data of the employee may be obtained, and feature extraction is then performed on the card punching data to obtain feature data (the feature data are the same as those of the previous embodiment and are not described again here).
Step 202: The card punching data are input into the model, and an attendance prediction result is output.
In this embodiment, the model of the first embodiment outputs the time interval corresponding to the card punching time, the attendance prediction score, and the result corresponding to that score.
Step 203: The attendance prediction result is sent to a preset user side for verification.
In this embodiment, the data predicted as false attendance in the attendance prediction result may be sent to the preset user side, and the preset user side verifies whether the data are false.
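Steps 201 to 203 can be sketched end to end with stand-in components. The request fields, the toy scoring function and the reviewer inbox below are hypothetical placeholders for the real feature extractor, trained model and preset user side.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Prediction:
    employee_id: str
    interval: str
    score: float
    is_false: bool


def handle_punch(
    request: dict,
    extract_features: Callable[[dict], dict],
    model: Callable[[dict], float],
    thresholds: dict,
    notify: Callable[[Prediction], None],
) -> Prediction:
    """Score one card punching request and forward suspected false attendance."""
    features = extract_features(request)        # step 201: acquire and featurize
    score = model(features)                     # step 202: model outputs a score
    interval = request["interval"]
    prediction = Prediction(
        request["employee_id"], interval, score, score > thresholds[interval]
    )
    if prediction.is_false:
        notify(prediction)                      # step 203: send for verification
    return prediction


# Stand-ins for illustration only.
reviewer_inbox: list[Prediction] = []
toy_model = lambda f: 0.9 if f["distance_to_site_m"] > 500 else 0.1
extract = lambda r: {"distance_to_site_m": r["distance_to_site_m"]}
thresholds = {"day_shift": 0.7}

ok = handle_punch(
    {"employee_id": "e1", "interval": "day_shift", "distance_to_site_m": 20},
    extract, toy_model, thresholds, reviewer_inbox.append,
)
suspect = handle_punch(
    {"employee_id": "e2", "interval": "day_shift", "distance_to_site_m": 3000},
    extract, toy_model, thresholds, reviewer_inbox.append,
)
```

Only the second request ends up in `reviewer_inbox`, mirroring the text's statement that the data predicted as false attendance are sent on for verification.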
As an optional implementation manner of this embodiment, the model is optimized based on the data predicted as false.
In this optional implementation manner, the model may optimize itself by self-learning from at least one result obtained in real-time prediction. The model may also be optimized by modifying the prediction threshold as disclosed in the first embodiment.
As an optional implementation manner of this embodiment, if the prediction threshold is adjusted, the attendance prediction result is changed based on the adjusted prediction threshold.
In this optional implementation manner, the management end may adjust the prediction threshold. Adjusting the prediction threshold can raise the attendance requirements, for example the requirements that the feature data contained in the attendance data must meet. This improves the flexibility of attendance checking.
It should be noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system such as a set of computer-executable instructions and that, although a logical order is illustrated in the flowcharts, in some cases, the steps illustrated or described may be performed in an order different than presented herein.
According to an embodiment of the present disclosure, there is also provided an apparatus for implementing the attendance prediction method, the apparatus including: a preprocessing unit configured to classify sample data based on preconfigured time intervals and the card punching time in the sample data, after attendance data of a large number of users are acquired from an attendance system as the sample data; a feature extraction unit configured to extract, for the sample data in each time interval, data of preset dimensions as feature data; and a model training unit configured to use the feature data as the input of a pre-established model and output the prediction scores of the sample data in the different time intervals; the model training unit being further configured to determine, based on the prediction scores and the preconfigured prediction thresholds corresponding to the different time intervals, the attendance prediction result corresponding to each prediction score, wherein the prediction result indicates whether the attendance is false.
The apparatus is further configured to determine, for any time interval, the dimensions of the data to be extracted in that interval as the basis for extracting the feature data.
As an optional implementation manner of this embodiment, using the feature data as the input of the pre-established model includes: using the feature data as the input of a pre-established XGBoost (XGB) model, a LightGBM model and a random forest model, respectively; combining the results output by the different models; and determining the final predicted value of each sample by weighted voting, wherein the predicted value represents the probability of false attendance behavior.
The embodiment of the present disclosure provides an electronic device, as shown in fig. 3, the electronic device includes one or more processors 31 and a memory 32, where one processor 31 is taken as an example in fig. 3.
The electronic device may further include: an input device 33 and an output device 34.
The processor 31, the memory 32, the input device 33 and the output device 34 may be connected by a bus or other means, and fig. 3 illustrates the connection by a bus as an example.
The processor 31 may be a Central Processing Unit (CPU). The processor 31 may also be other general purpose processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or combinations thereof. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 32, which is a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules corresponding to the attendance prediction method in the embodiments of the present disclosure. The processor 31 executes various functional applications of the server and data processing by running the non-transitory software programs, instructions and modules stored in the memory 32, that is, it implements the method of the method embodiments described above.
The memory 32 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to use of a processing device operated by the server, and the like. Further, the memory 32 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 32 may optionally include memory located remotely from the processor 31, which may be connected to a network connection device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 33 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the processing device of the server. The output device 34 may include a display device such as a display screen.
One or more modules are stored in the memory 32, which when executed by the one or more processors 31 perform the method as shown in fig. 1.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing related hardware. The program can be stored in a computer readable storage medium and, when executed, can include the processes of the method embodiments described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), a flash memory, a hard disk drive (HDD), a solid state drive (SSD), etc.; the storage medium may also comprise a combination of memories of the kinds described above.
Although the embodiments of the present disclosure have been described in conjunction with the accompanying drawings, those skilled in the art may make various modifications and variations without departing from the spirit and scope of the present disclosure, and such modifications and variations fall within the scope defined by the appended claims.

Claims (10)

1. An attendance prediction method is characterized by comprising the following steps:
after attendance data of a large number of users are acquired from an attendance system and serve as sample data, the sample data are classified based on a preconfigured time interval and the time of card punching in the sample data;
extracting data of a preset dimension from the sample data in each time interval as feature data;
taking the feature data as the input of a pre-established model, and outputting the prediction scores of the sample data in different time intervals;
and determining an attendance prediction result corresponding to the prediction score in different time intervals based on the prediction threshold corresponding to the different preconfigured time intervals and the prediction score, wherein the prediction result is used for indicating whether attendance is false.
2. The attendance prediction method of claim 1, further comprising:
judging whether the prediction result is correct or not based on the feedback data of the sample data;
if the prediction is incorrect, adjusting the output prediction result;
determining a prediction threshold of the model based on all adjusted prediction results, wherein when the prediction score is greater than the prediction threshold, the prediction result indicates that the attendance is falsified; otherwise, the prediction result indicates that the attendance is not falsified.
3. The attendance prediction method of claim 1, further comprising: for any time interval, determining the dimensions of the data to be extracted for that interval, which serve as the basis for extracting the feature data.
4. The attendance prediction method of claim 1, wherein taking the feature data as the input of a pre-established model comprises:
taking the feature data as the input of a pre-established XGB model, a LightGBM model and a random forest model, respectively;
combining the results output by the different models;
and determining the final predicted value of each sample by weighted voting, wherein the predicted value represents the probability that the attendance is falsified.
5. The attendance prediction method according to claim 1, wherein for the sample data in each time interval, extracting data of a preset dimension therefrom as feature data comprises:
for the sample data in each time interval, extracting motion data recorded at the time of card punching;
and for the sample data in each time interval, extracting the card-punching location data.
6. The attendance prediction method of claim 5, further comprising:
for the sample data in each time interval, extracting the power consumption data of the device in the sample data.
7. The attendance prediction method of claim 1, further comprising:
if the prediction threshold is adjusted, updating the attendance prediction result based on the adjusted prediction threshold.
8. An attendance prediction method implemented based on any one of claims 1 to 7, comprising:
in response to receiving a card punching request of an employee, obtaining card punching data of the employee in real time;
inputting the card punching data into a model, and outputting an attendance prediction result;
and sending the attendance prediction result to a preset user side so as to verify the prediction result.
9. An attendance prediction apparatus, comprising:
a preprocessing unit configured to, after attendance data of a large number of users has been acquired from an attendance system as sample data, classify the sample data based on preconfigured time intervals and the card-punching times in the sample data;
a feature extraction unit configured to extract data of a preset dimension as feature data from the sample data in each time interval;
a model training unit configured to output the prediction scores of the sample data in different time intervals with the feature data as an input of a pre-established model;
the model training unit is further configured to determine an attendance prediction result corresponding to the prediction score in different time intervals based on the prediction threshold and the prediction score corresponding to the different pre-configured time intervals, wherein the prediction result is used for indicating whether the attendance is false or not.
10. A computer-readable storage medium having stored thereon computer instructions for causing the computer to perform the attendance prediction method of any of claims 1-8.
CN202111550306.0A 2021-12-17 2021-12-17 Attendance checking prediction method and device Pending CN114358395A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111550306.0A CN114358395A (en) 2021-12-17 2021-12-17 Attendance checking prediction method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111550306.0A CN114358395A (en) 2021-12-17 2021-12-17 Attendance checking prediction method and device

Publications (1)

Publication Number Publication Date
CN114358395A true CN114358395A (en) 2022-04-15

Family

ID=81099838

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111550306.0A Pending CN114358395A (en) 2021-12-17 2021-12-17 Attendance checking prediction method and device

Country Status (1)

Country Link
CN (1) CN114358395A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117196560A (en) * 2023-11-07 2023-12-08 深圳市慧云智跑网络科技有限公司 Data acquisition method and system of card punching equipment based on Internet of things
CN117196560B (en) * 2023-11-07 2024-02-13 深圳市慧云智跑网络科技有限公司 Data acquisition method and system of card punching equipment based on Internet of things

Similar Documents

Publication Publication Date Title
WO2021232229A1 (en) Virtual scene generation method and apparatus, computer device and storage medium
US20170364821A1 (en) Method and system for analyzing driver behaviour based on telematics data
JP2019533242A (en) System and method for predicting fraud in automobile warranty
US7778715B2 (en) Methods and systems for a prediction model
CN112800116B (en) Method and device for detecting abnormity of service data
CN112188531B (en) Abnormality detection method, abnormality detection device, electronic apparatus, and computer storage medium
CN104123592B (en) Bank's backstage TPS transaction events trend forecasting method and system
CN109918279B (en) Electronic device, method for identifying abnormal operation of user based on log data and storage medium
US11119472B2 (en) Computer system and method for evaluating an event prediction model
CN111460312A (en) Method and device for identifying empty-shell enterprise and computer equipment
GB2507186A (en) Automatic Output of Severe Weather Warnings at a Mobile Computing Device
CN115280337A (en) Machine learning based data monitoring
CN111768040A (en) Model interpretation method, device, equipment and readable storage medium
EP3421315B1 (en) Systems and methods for authenticating drivers based on gps data
CN116415931A (en) Big data-based power equipment operation state monitoring method and system
CN114358395A (en) Attendance checking prediction method and device
CN115796826A (en) Management method, system, device and storage medium for ship safety management and control
Sun et al. On the tradeoff between sensitivity and specificity in bus bunching prediction
CN105162931B (en) The sorting technique and device of a kind of communicating number
CN117094184B (en) Modeling method, system and medium of risk prediction model based on intranet platform
CN113282920A (en) Log abnormity detection method and device, computer equipment and storage medium
CN113032239A (en) Risk prompting method and device, electronic equipment and storage medium
CN117368862A (en) High-efficiency weather radar data quality evaluation system
US11915180B2 (en) Systems and methods for identifying an officer at risk of an adverse event
CN110969209B (en) Stranger identification method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination