CN110797030A - Method and system for working hour statistics based on voice recognition

Method and system for working hour statistics based on voice recognition

Info

Publication number
CN110797030A
Authority
CN
China
Prior art keywords: conversation, voice, time, staff, working
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911018731.8A
Other languages
Chinese (zh)
Other versions
CN110797030B (en)
Inventor
朱树荫
梁志婷
徐浩
吴明辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Mingsheng Pinzhi Artificial Intelligence Technology Co., Ltd.
Original Assignee
Miaozhen Systems Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Miaozhen Systems Information Technology Co Ltd
Priority to CN201911018731.8A
Publication of CN110797030A
Application granted
Publication of CN110797030B
Legal status: Active (granted)

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 17/00 Speaker identification or verification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 10/00 Administration; Management
    • G06Q 10/10 Office automation; Time management
    • G06Q 10/109 Time management, e.g. calendars, reminders, meetings or time accounting
    • G06Q 10/1091 Recording time for administrative or management purposes
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 17/00 Speaker identification or verification techniques
    • G10L 17/02 Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 17/00 Speaker identification or verification techniques
    • G10L 17/04 Training, enrolment or model building

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Strategic Management (AREA)
  • Data Mining & Analysis (AREA)
  • Educational Administration (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Telephonic Communication Services (AREA)

Abstract

A method, system, and computer-readable storage medium for working-hour statistics based on speech recognition, wherein the method comprises: collecting voice information of an employee's conversations during a working period; confirming identity by recognizing voiceprint features of the voice information; determining the conversation start time and the conversation end time of the identity-confirmed voice information; and determining a conversation voice period from the conversation start time and the conversation end time, and counting the employee's actual working hours within the working period according to the conversation voice period. The embodiments of the application monitor employees' on-duty condition based on speech recognition and automatically count their actual working hours, so that work intensity can be quantified and effective data support can be provided for employee performance indexes.

Description

Method and system for working hour statistics based on voice recognition
Technical Field
This document relates to the field of performance assessment and, more particularly, to a method, system, and computer-readable storage medium for working-hour statistics based on speech recognition.
Background
In offline stores, video monitoring systems are used to record employees' on-duty images throughout the day, and dedicated personnel replay or watch the monitoring footage in real time to learn about the employees' on-duty working conditions; this approach requires substantial labor cost, and the large volume of video footage requires a large amount of storage space.
In addition, employees' actual working hours (for example, the time a shopping guide spends receiving and conversing with customers) cannot be counted accurately and objectively, so the employees' work intensity cannot be determined.
Disclosure of Invention
The application provides a method, a system, and a computer-readable storage medium for working-hour statistics based on speech recognition, so as to automatically count employees' actual working hours.
The embodiment of the application provides a method for working-hour statistics based on speech recognition, comprising the following steps:
collecting voice information of an employee's conversations during a working period;
confirming identity by recognizing voiceprint features of the voice information;
determining the conversation start time and the conversation end time of the identity-confirmed voice information;
and determining a conversation voice period from the conversation start time and the conversation end time, and counting the employee's actual working hours within the working period according to the conversation voice period.
In an embodiment, the method further comprises:
modeling each employee's voiceprint in advance to obtain an individual voiceprint model for each employee.
In an embodiment, the confirming identity by recognizing voiceprint features of the voice information includes:
extracting features from the voice information, comparing and matching the obtained voiceprint features with the individual voiceprint model, and confirming the identity if the match succeeds.
In an embodiment, the determining the conversation start time and the conversation end time of the identity-confirmed voice information includes:
analyzing the identity-confirmed voice information, determining that a conversation segment has ended when the interval between utterances exceeds a preset conversation blank threshold, and recording the corresponding conversation start time and conversation end time.
In an embodiment, the method further comprises:
determining the employee's work intensity according to the employee's actual working hours within the working period and the employee's total working period.
An embodiment of the present application further provides a method for working-hour statistics based on speech recognition, comprising:
acquiring voice information of an employee's conversations during a working period, and determining the conversation start time and the conversation end time of the voice information;
and determining a conversation voice period from the conversation start time and the conversation end time, and counting the employee's actual working hours within the working period according to the conversation voice period.
In an embodiment, the determining the conversation start time and the conversation end time of the voice information includes:
analyzing the voice information, determining that a conversation segment has ended when an interval between utterances in the voice information exceeds a preset conversation blank threshold, and recording the corresponding conversation start time and conversation end time.
An embodiment of the present application further provides a system for working-hour statistics, comprising a server and a plurality of mobile terminals, wherein:
the mobile terminal is configured to collect voice information of an employee's conversations during the working period, confirm identity by recognizing voiceprint features of the voice information, and send the identity-confirmed voice information to the server;
and the server is configured to determine the conversation start time and the conversation end time of the identity-confirmed voice information, determine a conversation voice period from them, and count the employee's actual working hours within the working period according to the conversation voice period.
An embodiment of the present application further provides a server, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the above method for working-hour statistics based on speech recognition.
An embodiment of the present application further provides a computer-readable storage medium storing computer-executable instructions for performing the above method for working-hour statistics based on speech recognition.
Compared with the related art, the method comprises: collecting voice information of an employee's conversations during the working period; confirming identity by recognizing voiceprint features of the voice information; determining the conversation start time and the conversation end time of the identity-confirmed voice information; and determining a conversation voice period from the conversation start time and the conversation end time, and counting the employee's actual working hours within the working period according to the conversation voice period. The embodiments of the application monitor employees' on-duty condition based on speech recognition and automatically count their actual working hours, so that work intensity can be quantified and effective data support can be provided for employee performance indexes.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the application. Other advantages of the application may be realized and attained by the instrumentalities and combinations particularly pointed out in the specification, claims, and drawings.
Drawings
The accompanying drawings are included to provide an understanding of the present disclosure, are incorporated in and constitute a part of this specification, and illustrate embodiments of the disclosure; together with the description they serve to explain the principles of the disclosure without limiting it.
FIG. 1 is a flowchart of a method for working-hour statistics according to an embodiment of the present application;
FIG. 2 is a schematic diagram of an implementation of the embodiment of the present application combining a mobile terminal and a server;
FIG. 3 is a flowchart of a method for working-hour statistics (applied to a mobile terminal) according to an embodiment of the present application;
FIG. 4 is a flowchart of a method for working-hour statistics (applied to a server) according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a system for working-hour statistics according to an embodiment of the present application.
Detailed Description
The present application describes embodiments, but the description is illustrative rather than limiting and it will be apparent to those of ordinary skill in the art that many more embodiments and implementations are possible within the scope of the embodiments described herein. Although many possible combinations of features are shown in the drawings and discussed in the detailed description, many other combinations of the disclosed features are possible. Any feature or element of any embodiment may be used in combination with or instead of any other feature or element in any other embodiment, unless expressly limited otherwise.
The present application includes and contemplates combinations of features and elements known to those of ordinary skill in the art. The embodiments, features and elements disclosed in this application may also be combined with any conventional features or elements to form a unique inventive concept as defined by the claims. Any feature or element of any embodiment may also be combined with features or elements from other inventive aspects to form yet another unique inventive aspect, as defined by the claims. Thus, it should be understood that any of the features shown and/or discussed in this application may be implemented alone or in any suitable combination. Accordingly, the embodiments are not limited except as by the appended claims and their equivalents. Furthermore, various modifications and changes may be made within the scope of the appended claims.
Further, in describing representative embodiments, the specification may have presented the method and/or process as a particular sequence of steps. However, to the extent that the method or process does not rely on the particular order of steps set forth herein, the method or process should not be limited to the particular sequence of steps described. Other orders of steps are possible as will be understood by those of ordinary skill in the art. Therefore, the particular order of the steps set forth in the specification should not be construed as limitations on the claims. Further, the claims directed to the method and/or process should not be limited to the performance of their steps in the order written, and one skilled in the art can readily appreciate that the sequences may be varied and still remain within the spirit and scope of the embodiments of the present application.
The embodiment of the application provides a method and a system for working-hour statistics based on speech recognition, which monitor an employee's working condition and reveal information such as the employee's actual working hours and the customer-flow volume during working time.
The embodiments of the application can be applied to the performance assessment of employees who must frequently converse with customers, such as shopping guides and telephone operators.
As shown in FIG. 1, a method for working-hour statistics based on speech recognition in an embodiment of the present application includes:
Step 101: collect voice information of the employee's conversations during the working period.
The voice information can be acquired by having the employee wear a terminal with a voice-acquisition function at work.
Step 102: confirm identity by recognizing voiceprint features of the voice information.
In an embodiment, the method further comprises:
modeling each employee's voiceprint in advance to obtain an individual voiceprint model for each employee.
The individual voiceprint model holds multiple voiceprint data entries, each corresponding to the voiceprint of one of several keywords; the keywords can be set to greetings the employee commonly uses at work, such as "good morning", "hello", and "excuse me". For example, a shopping guide facing a customer usually opens the conversation with such phrases, so the voiceprint features of words like "hello", "good morning", and "excuse me" are stored in the individual voiceprint model in advance as conversation openers and can be recognized quickly during identity recognition.
When the voice information is processed for recognition, if a segment of voice information contains voiceprint features matching the voiceprint model of a keyword, the segment is considered to come from the employee rather than from an unrelated source (for example, a recording device playing back speech).
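The patent does not specify a concrete modeling algorithm. As a minimal illustrative sketch in Python (the MFCC-averaging shortcut, the 16 kHz sample rate, and all function names are assumptions, not the patent's method; a production system would use a real speaker-embedding model), keyword-voiceprint enrollment could look like this:

```python
# Sketch of keyword-voiceprint enrollment, assuming librosa is available.
# MFCC averaging is a simplified stand-in for a real speaker-embedding model;
# all names here are hypothetical, not taken from the patent.
import librosa
import numpy as np

def keyword_embedding(wav_path, n_mfcc=20):
    """Reduce one recorded keyword utterance to a fixed-length voiceprint vector."""
    y, sr = librosa.load(wav_path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)  # shape: (n_mfcc, frames)
    return mfcc.mean(axis=1)                                # average over time frames

def enroll_employee(keyword_wavs):
    """Build an individual voiceprint model: one averaged vector per greeting keyword.

    keyword_wavs maps a keyword (e.g. "good morning") to a list of recordings
    of the employee saying it.
    """
    return {kw: np.mean([keyword_embedding(p) for p in paths], axis=0)
            for kw, paths in keyword_wavs.items()}
```

An enrolled model would then be stored per employee, e.g. voiceprint_models["emp007"] = enroll_employee(...).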
In an embodiment, step 102 comprises:
extracting features from the voice information, comparing and matching the obtained voiceprint features with the individual voiceprint model, and confirming the identity if the match succeeds.
After features are extracted from the collected voice information, the voiceprint data in the individual voiceprint model are traversed for matching; a similarity threshold can be set, a similarity algorithm is used for the matching computation, and when the computed result falls within the similarity threshold, the voice is judged to be that of the corresponding employee.
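Continuing the sketch above (cosine similarity is one possible choice for the unspecified "similarity algorithm", and the threshold value is illustrative only):

```python
import numpy as np

def cosine_similarity(a, b):
    # Small epsilon guards against division by zero for silent inputs.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def confirm_identity(feature, voiceprint_model, threshold=0.85):
    """Traverse the keyword voiceprints in the individual model; identity is
    confirmed if any stored entry is similar enough to the live feature."""
    return any(cosine_similarity(feature, stored) >= threshold
               for stored in voiceprint_model.values())
```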
Step 103: determine the conversation start time and the conversation end time of the identity-confirmed voice information.
In an embodiment, step 103 comprises:
analyzing the identity-confirmed voice information, determining that a conversation segment has ended when the interval between utterances exceeds a preset conversation blank threshold, and recording the corresponding conversation start time and conversation end time.
A conversation blank threshold is preset; when an interval between utterances in the voice information is less than or equal to the threshold, the speech is judged to belong to the same conversation segment, and when an interval exceeds the threshold, the conversation segment is judged to have ended. For example, a piece of voice information with a total duration of 2 min may contain several intervals without conversational speech, but if every interval is shorter than the conversation blank threshold (which can be set to 15 s, for example), the voice information is judged to be a single 2-min conversation segment.
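A minimal sketch of this segmentation rule (assuming utterances arrive as (start, end) timestamps in seconds, e.g. from a voice-activity detector; the names are hypothetical):

```python
def split_into_conversations(utterances, blank_threshold=15.0):
    """Merge utterance timestamps into conversation segments: a gap less than
    or equal to blank_threshold continues the current segment; a larger gap
    ends it and starts a new one."""
    segments = []
    for start, end in sorted(utterances):
        if segments and start - segments[-1][1] <= blank_threshold:
            segments[-1] = (segments[-1][0], end)  # same conversation continues
        else:
            segments.append((start, end))          # previous conversation ended
    return segments

# A 2-minute recording whose internal pauses are all under 15 s collapses into
# one segment: split_into_conversations([(0, 40), (50, 90), (100, 120)])
# returns [(0, 120)], matching the example above.
```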
Step 104: determine a conversation voice period from the conversation start time and the conversation end time, and count the employee's actual working hours within the working period according to the conversation voice period.
The utterance start time and utterance end time of each conversation segment are recorded, and the conversation duration T_i of each segment is calculated (where i indexes the conversation segments).
The employee's actual working hours within the working period equal the sum of the conversation durations T_i of all conversation segments.
In an embodiment, the method further comprises:
determining the employee's work intensity according to the employee's actual working hours within the working period and the employee's total working period.
The shift-scheduling information table can be consulted to obtain the employee's working period, and the conversation durations of all conversation segments within the working period are totaled to calculate the actual working-hour proportion P:

P = (Σ_i T_i) / T_total

where T_total is the total duration of the working period.
The employee's work intensity can be determined from the actual working-hour proportion P.
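As a worked illustration of the formula (a hypothetical helper reusing the segments from the sketch above), P is simply the summed segment durations divided by the shift length:

```python
def actual_hour_proportion(segments, work_total_seconds):
    """P = (sum of conversation durations T_i) / T_total."""
    return sum(end - start for start, end in segments) / work_total_seconds

# e.g. 5 h of conversation in an 8 h shift: P = 18000 / 28800 = 0.625
```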
The embodiments of the application monitor employees' on-duty condition based on speech recognition and automatically count their actual working hours, so that work intensity can be quantified and effective data support can be provided for employee performance indexes.
As shown in FIG. 2, the embodiment of the present application can be implemented by combining a mobile terminal and a server.
For the mobile terminal, as shown in FIG. 3, a method for working-hour statistics based on speech recognition according to an embodiment of the present application includes:
Step 301: collect voice information of the employee during the working period.
The mobile terminal has a voice-acquisition function and is worn by the employee at work to acquire the voice information.
Step 302: confirm identity by recognizing voiceprint features of the voice information.
Each employee's voiceprint is modeled in advance to obtain an individual voiceprint model for each employee.
The individual voiceprint model holds multiple voiceprint data entries, each corresponding to the voiceprint of one of several keywords; the keywords can be set to greetings the employee commonly uses at work, such as "good morning", "hello", and "excuse me". For example, a shopping guide facing a customer usually opens the conversation with such phrases, so the voiceprint features of words like "hello", "good morning", and "excuse me" are stored in the individual voiceprint model in advance as conversation openers and can be recognized quickly during identity recognition.
When the voice information is processed for recognition, if a segment of voice information contains voiceprint features matching the voiceprint model of a keyword, the segment is considered to come from the employee rather than from an unrelated source (for example, a recording device playing back speech).
In an embodiment, step 302 includes:
extracting features from the voice information, comparing and matching the obtained voiceprint features with the individual voiceprint model, and confirming the identity if the match succeeds.
If the match fails, the voice information is discarded.
Step 303: send the identity-confirmed voice information to the server, so that the server determines the conversation voice periods and counts the employee's actual working hours within the working period.
If the voiceprint features extracted from the collected voice information match the voiceprint features stored in the mobile terminal, the data are transmitted to the server automatically.
There are multiple mobile terminals in the embodiment of the application; each can be worn on the body and is bound in advance to an employee's ID. The data sent to the server therefore include the voice information and the personal ID information. During voiceprint matching, the individual voiceprint model corresponding to the terminal's user is located via the personal ID information, and the voiceprint features extracted from the voice information are then compared with the voiceprint data in that individual model one by one; this avoids traversing all models and speeds up the speech-recognition processing.
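A sketch of the ID-keyed lookup described above (the payload fields and the in-memory registry are assumptions, not the patent's wire format; confirm_identity is the hypothetical matcher from the earlier sketch):

```python
# employee_id -> individual voiceprint model, built at enrollment time
voiceprint_models = {}

def handle_upload(payload):
    """payload: {"employee_id": ..., "voiceprint_feature": ..., "audio": ...}"""
    # Direct lookup by personal ID avoids traversing every employee's model.
    model = voiceprint_models.get(payload["employee_id"])
    if model is None:
        return False
    return confirm_identity(payload["voiceprint_feature"], model)
```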
For the server, as shown in FIG. 4, the method for working-hour statistics based on speech recognition according to an embodiment of the present application includes:
Step 401: acquire voice information of the employee's conversations during the working period, and determine the conversation start time and the conversation end time of the voice information.
The server receives the voice information sent by the mobile terminal; this voice information has already passed identity confirmation.
In an embodiment, step 401 comprises:
analyzing the voice information, determining that a conversation segment has ended when an interval between utterances in the voice information exceeds a preset conversation blank threshold, and recording the corresponding conversation start time and conversation end time.
A conversation blank threshold is preset; when an interval between utterances in the voice information is less than or equal to the threshold, the speech is judged to belong to the same conversation segment, and when an interval exceeds the threshold, the conversation segment is judged to have ended. For example, a piece of voice information with a total duration of 2 min may contain several intervals without conversational speech, but if every interval is shorter than the conversation blank threshold (which can be set to 15 s, for example), the voice information is judged to be a single 2-min conversation segment.
Step 402: determine a conversation voice period from the conversation start time and the conversation end time, and count the employee's actual working hours within the working period according to the conversation voice period.
The utterance start time and utterance end time of each conversation segment are recorded, and the conversation duration T_i of each segment is calculated (where i indexes the conversation segments).
The employee's actual working hours within the working period equal the sum of the conversation durations T_i of all conversation segments.
In an embodiment, the method further comprises:
determining the employee's work intensity according to the employee's actual working hours within the working period and the employee's total working period.
The shift-scheduling information table can be consulted to obtain the employee's working period, and the conversation durations of all conversation segments within the working period are totaled to calculate the actual working-hour proportion P:

P = (Σ_i T_i) / T_total

where T_total is the total duration of the working period.
The employee's work intensity can be determined from the actual working-hour proportion P.
As shown in FIG. 5, an embodiment of the present application provides a system for working-hour statistics, including a server and a plurality of mobile terminals, wherein:
the mobile terminal 51 is configured to collect voice information of the employee's conversations during the working period, confirm identity by recognizing voiceprint features of the voice information, and send the identity-confirmed voice information to the server;
the server 52 is configured to determine the conversation start time and the conversation end time of the identity-confirmed voice information, determine a conversation voice period from them, and count the employee's actual working hours within the working period according to the conversation voice period.
The mobile terminal 51 is provided with:
(1) Voice acquisition module 511
The voice acquisition module 511 automatically collects voice information.
(2) Speech recognition module 512
The speech recognition module 512 includes an individual voiceprint model and an identity recognition unit.
Each employee's voiceprint is modeled in advance to obtain the individual voiceprint models; the individual voiceprint model holds multiple voiceprint data entries, each corresponding to the voiceprint of one of several keywords, and the keywords can be set to greetings commonly used in shopping-guide work, such as "good morning", "hello", and "excuse me". A shopping guide facing a customer usually opens the conversation with such phrases, so the voiceprint features of words like "hello", "good morning", and "excuse me" are stored in the individual voiceprint model in advance as conversation openers and can be recognized quickly during identity recognition.
The identity recognition unit extracts voiceprint features from the voice information, compares and matches them with the individual voiceprint model, and confirms identity to obtain identity-confirmation information.
(3) Data transmission module 513
The data transmission module 513 transmits the voice information to the server in real time after the identity recognition unit has confirmed the identity. In this scheme, if the voiceprint features extracted from the collected voice information match the voiceprint features stored in the mobile terminal, the data are transmitted to the server automatically.
In this scheme there are multiple mobile terminals 51, each of which can be worn on the body and is bound in advance to an employee's ID. The data transmitted by the data transmission module to the server therefore include the voice information and the personal ID information.
The server 52 is provided with:
(1) Conversation duration calculation module 521
The voice information is split into conversation segments: a conversation blank threshold is preset; when an interval between utterances in the voice information is less than or equal to the threshold, the speech is judged to belong to the same conversation segment, and when an interval exceeds the threshold, the conversation segment is judged to have ended. For example, a piece of voice information with a total duration of 2 min may contain several intervals without conversational speech, but if every interval is shorter than the conversation blank threshold (which can be set to 15 s), the voice information is judged to be a single 2-min conversation segment.
Conversation duration calculation: the utterance start time and utterance end time of each conversation segment are recorded, and the conversation duration T_i of each segment is calculated (where i indexes the conversation segments).
(2) Actual working-hour statistics module 522
The shift-scheduling information table is consulted to obtain the employee's working period, and the conversation durations of all conversation segments within the working period are totaled to calculate the actual working-hour proportion P:

P = (Σ_i T_i) / T_total

where T_total is the total duration of the working period.
The employee's work intensity is determined from the actual working-hour proportion P.
(3) Data management module 523
The data management module 523 manages the data uploaded by the mobile terminals 51: a corresponding sub-database is built for each mobile terminal according to the terminal's ID information, the data uploaded by different mobile terminals are stored in the corresponding sub-databases, and the corresponding ID is stored in the employee's shift-scheduling information table so that the working period corresponding to the ID can be obtained.
In summary, each sub-database includes: the mobile terminal's ID, the working period, the conversation durations, the conversation dates, and the actual working-hour proportion.
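Each sub-database entry could be sketched as the following record (the field names and types are illustrative, not from the patent):

```python
from dataclasses import dataclass, field

@dataclass
class SubDatabaseRecord:
    terminal_id: str                     # ID bound to the mobile terminal / employee
    work_period: tuple                   # working period, e.g. ("09:00", "18:00")
    conversation_date: str               # e.g. "2019-10-24"
    conversation_durations: list = field(default_factory=list)  # T_i values, seconds
    actual_hour_proportion: float = 0.0  # the day's P value
```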
It should be noted that in the embodiment of the present application the mobile terminal and the server divide the work and cooperate to realize the working-hour statistics: the mobile terminal performs voice collection, feature extraction, and recognition verification, and the server is responsible for calculating the actual working hours. Other arrangements may also be adopted in other embodiments. For example, the mobile terminal may be responsible only for voice collection while the server performs feature extraction, recognition verification, and working-hour statistics; in that case the employee's individual voiceprint model can be located quickly via the employee's personal ID information to enable fast voiceprint matching. Alternatively, the mobile terminal may independently implement voice collection, feature extraction, recognition verification, working-hour statistics, and so on.
An embodiment of the present application further provides a mobile terminal, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the above working-hour statistics method when executing the program.
An embodiment of the present application further provides a server, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the above working-hour statistics method when executing the program.
An embodiment of the present application further provides a computer-readable storage medium storing computer-executable instructions for performing the above working-hour statistics method.
In this embodiment, the storage medium may include, but is not limited to: a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disc, and other media capable of storing program code.
It will be understood by those of ordinary skill in the art that all or some of the steps of the methods, systems, and functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all of the components may be implemented as software executed by a processor, such as a digital signal processor or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for the storage of information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. In addition, as is known to those skilled in the art, communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and include any information delivery media.

Claims (10)

1. A method for working-hour statistics based on speech recognition, comprising the following steps:
collecting voice information of an employee's conversations during a working period;
confirming identity by recognizing voiceprint features of the voice information;
determining the conversation start time and the conversation end time of the identity-confirmed voice information;
and determining a conversation voice period from the conversation start time and the conversation end time, and counting the employee's actual working hours within the working period according to the conversation voice period.
2. The method for working-hour statistics based on speech recognition according to claim 1, further comprising:
modeling each employee's voiceprint in advance to obtain an individual voiceprint model for each employee.
3. The method for working-hour statistics based on speech recognition according to claim 1, wherein the confirming identity by recognizing voiceprint features of the voice information comprises:
extracting features from the voice information, comparing and matching the obtained voiceprint features with the individual voiceprint model, and confirming the identity if the match succeeds.
4. The method according to claim 1, wherein the determining the conversation start time and the conversation end time of the identity-confirmed voice information comprises:
analyzing the identity-confirmed voice information, determining that a conversation segment has ended when the interval between utterances exceeds a preset conversation blank threshold, and recording the corresponding conversation start time and conversation end time.
5. The method for working-hour statistics based on speech recognition according to claim 1, further comprising:
determining the employee's work intensity according to the employee's actual working hours within the working period and the employee's total working period.
6. A method for working-hour statistics based on speech recognition, comprising the following steps:
acquiring voice information of an employee's conversations during a working period, and determining the conversation start time and the conversation end time of the voice information;
and determining a conversation voice period from the conversation start time and the conversation end time, and counting the employee's actual working hours within the working period according to the conversation voice period.
7. The method according to claim 6, wherein the determining the conversation start time and the conversation end time of the voice information comprises:
analyzing the voice information, determining that a conversation segment has ended when an interval between utterances in the voice information exceeds a preset conversation blank threshold, and recording the corresponding conversation start time and conversation end time.
8. A system for working-hour statistics, comprising a server and a plurality of mobile terminals, wherein:
the mobile terminal is configured to collect voice information of an employee's conversations during the working period, confirm identity by recognizing voiceprint features of the voice information, and send the identity-confirmed voice information to the server;
and the server is configured to determine the conversation start time and the conversation end time of the identity-confirmed voice information, determine a conversation voice period from them, and count the employee's actual working hours within the working period according to the conversation voice period.
9. A server, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the method according to any one of claims 6 to 7 when executing the program.
10. A computer-readable storage medium storing computer-executable instructions for performing the method of any one of claims 1-7.
CN201911018731.8A 2019-10-24 2019-10-24 Method and system for working hour statistics based on voice recognition Active CN110797030B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911018731.8A CN110797030B (en) 2019-10-24 2019-10-24 Method and system for working hour statistics based on voice recognition


Publications (2)

Publication Number Publication Date
CN110797030A true CN110797030A (en) 2020-02-14
CN110797030B CN110797030B (en) 2022-06-07

Family

ID=69441140

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911018731.8A Active CN110797030B (en) 2019-10-24 2019-10-24 Method and system for working hour statistics based on voice recognition

Country Status (1)

Country Link
CN (1) CN110797030B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113506097A (en) * 2021-09-10 2021-10-15 北京明略昭辉科技有限公司 On-duty state monitoring method, device, equipment and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106686223A (en) * 2016-12-19 2017-05-17 中国科学院计算技术研究所 A system and method for assisting dialogues between a deaf person and a normal person, and a smart mobile phone
CN107230478A (en) * 2017-05-03 2017-10-03 上海斐讯数据通信技术有限公司 A kind of voice information processing method and system
CN109509545A (en) * 2018-10-23 2019-03-22 平安医疗健康管理股份有限公司 Wire examination method of making the rounds of the wards, device, server and medium based on bio-identification
CN109657624A (en) * 2018-12-21 2019-04-19 秒针信息技术有限公司 Monitoring method, the device and system of target object
CN109660679A (en) * 2018-09-27 2019-04-19 深圳壹账通智能科技有限公司 Collection is attended a banquet monitoring method, device, equipment and the storage medium at end
CN109727600A (en) * 2017-10-26 2019-05-07 北京航天长峰科技工业集团有限公司 A kind of phrase sound method for identifying speaker unrelated based on text
US20190306644A1 (en) * 2014-06-23 2019-10-03 Glen A. Norris Controlling a location of binaural sound with a command


Also Published As

Publication number Publication date
CN110797030B (en) 2022-06-07


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20210816

Address after: 200232 unit 5b06, floor 5, building 2, No. 277, Longlan Road, Xuhui District, Shanghai

Applicant after: Shanghai Mingsheng Pinzhi Artificial Intelligence Technology Co.,Ltd.

Address before: 100102 room 321008, 5 building, 1 Tung Fu Street East, Chaoyang District, Beijing.

Applicant before: MIAOZHEN INFORMATION TECHNOLOGY Co.,Ltd.

GR01 Patent grant