CN110827829A - Passenger flow analysis method and system based on voice recognition - Google Patents


Info

Publication number
CN110827829A
CN110827829A (application CN201911018711.0A)
Authority
CN
China
Prior art keywords
client
voice information
voiceprint
dialogue
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911018711.0A
Other languages
Chinese (zh)
Inventor
朱树荫
梁志婷
徐浩
吴明辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Mingsheng Pinzhi Artificial Intelligence Technology Co.,Ltd.
Original Assignee
Miaozhen Systems Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Miaozhen Systems Information Technology Co Ltd filed Critical Miaozhen Systems Information Technology Co Ltd
Priority to CN201911018711.0A priority Critical patent/CN110827829A/en
Publication of CN110827829A publication Critical patent/CN110827829A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/02 Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/04 Training, enrolment or model building
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/22 Interactive procedures; Man-machine interfaces

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Health & Medical Sciences (AREA)
  • Strategic Management (AREA)
  • Acoustics & Sound (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Accounting & Taxation (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Data Mining & Analysis (AREA)
  • Telephonic Communication Services (AREA)

Abstract

A passenger flow analysis method and system based on voice recognition are provided. The method includes the following steps: collecting dialogue voice information of a conversation between an employee and a client; determining the client voice information containing the client's voice by identifying the employee voice information in the dialogue voice information; determining client identity information according to the voiceprint features of the client voice information; and performing statistical analysis on customer flow information according to the client identity information. By performing passenger flow analysis based on voice recognition, the embodiments of the application automate the statistical analysis of passenger flow, save labor cost and storage space, and enable managers to allocate human resources reasonably according to the passenger flow situation.

Description

Passenger flow analysis method and system based on voice recognition
Technical Field
The present disclosure relates to the field of statistical analysis, and more particularly, to a method, system and computer-readable storage medium for analyzing passenger flow based on speech recognition.
Background
For offline stores, managers pay close attention to the store's passenger flow. Some stores adopt a video monitoring system that films the interior of the store all day long, and learn about the passenger flow situation by having dedicated personnel replay the video or watch the monitoring feed in real time. This approach requires considerable labor cost, and the large volume of video footage requires a large amount of storage space.
In addition, in the scenes of telephone shopping, service consultation and the like, managers also need to know the passenger flow situation.
Disclosure of Invention
The application provides a passenger flow analysis method, a passenger flow analysis system and a computer readable storage medium based on voice recognition, so as to automatically perform statistical analysis on passenger flow.
The embodiment of the application provides a passenger flow analysis method based on voice recognition, which comprises the following steps:
collecting conversation voice information of a conversation between an employee and a client;
determining client voice information containing client voice by identifying employee voice information in the conversation voice information;
determining client identity information according to the voiceprint characteristics of the client voice information;
and carrying out statistical analysis on the customer flow information according to the customer identity information.
In an embodiment, the method further comprises:
and modeling the voiceprint of the employee in advance to obtain an individual voiceprint model of the employee.
In one embodiment, the determining the customer voice information including the customer voice by recognizing the employee voice information in the dialogue voice information includes:
and extracting the characteristics of the dialogue voice information, comparing and matching the obtained voiceprint characteristics with the individual voiceprint models of the employees, marking the voice information which is successfully matched in the dialogue voice information according to the matching result, and determining that the voice information which is not marked in the dialogue voice information is the client voice information.
In one embodiment, the determining the client identity information according to the voiceprint feature of the client voice information includes:
splitting the dialogue voice information to obtain dialogue sections;
extracting the voiceprint characteristics of the client voice information in the dialog section;
and determining the client type to which the client voice information belongs according to the voiceprint characteristics of the client voice information.
In an embodiment, the splitting the dialog voice information to obtain a dialog segment includes:
and analyzing the dialogue voice information, comparing the dialogue interval duration in the dialogue voice information with a preset dialogue blank threshold value, determining the dialogue starting time and the dialogue ending time of the dialogue section, and splitting according to the dialogue starting time and the dialogue ending time to obtain the dialogue section.
In an embodiment, the determining, according to the voiceprint feature of the client voice information, a client type to which the client voice information belongs includes: comparing and matching the voiceprint features of the client voice information with the client voiceprint features in a voiceprint database, if the matched client voiceprint features are not found in the voiceprint database, determining that the voiceprint features of the client voice information belong to a new client, and storing the voiceprint features of the new client into the voiceprint database; and if the matched client voiceprint characteristics are found in the voiceprint database, determining that the voiceprint characteristics of the client voice information belong to the old client.
The embodiment of the present application further provides a passenger flow analysis method based on speech recognition, including:
obtaining dialogue voice information containing client voice information, and determining client identity information according to voiceprint characteristics of the client voice information;
and carrying out statistical analysis on the client flow information according to the client identity information.
The embodiment of the present application further provides a passenger flow analysis system based on voice recognition, including a server and a plurality of mobile terminals, wherein:
the mobile terminal is used for collecting conversation voice information of a conversation between an employee and a client, determining client voice information containing client voice by identifying employee voiceprint characteristics of the conversation voice information, and sending the client voice information to the server;
and the server is used for determining client identity information according to the client voiceprint characteristics of the client voice information and carrying out statistical analysis on client flow information according to the client identity information.
An embodiment of the present application further provides a server, including: a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method for speech recognition based passenger flow analysis when executing the program.
The embodiment of the application also provides a computer-readable storage medium, which stores computer-executable instructions, and the computer-executable instructions are used for executing the passenger flow analysis method based on the voice recognition.
Compared with the related art, the method includes: collecting dialogue voice information of a conversation between an employee and a client; determining the client voice information containing the client's voice by identifying the employee voice information in the dialogue voice information; determining client identity information according to the voiceprint features of the client voice information; and performing statistical analysis on customer flow information according to the client identity information. By performing passenger flow analysis based on voice recognition, the embodiments of the application automate the statistical analysis of passenger flow, save labor cost and storage space, and enable managers to allocate human resources reasonably according to the passenger flow situation.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the application. Other advantages of the application may be realized and attained by the instrumentalities and combinations particularly pointed out in the specification, claims, and drawings.
Drawings
The accompanying drawings are included to provide an understanding of the present disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the examples serve to explain the principles of the disclosure and not to limit the disclosure.
FIG. 1 is a flow chart of a method for analyzing passenger flow based on speech recognition according to an embodiment of the present application;
FIG. 2 is a flowchart of step 103 according to an embodiment of the present application;
fig. 3 is a schematic diagram of an implementation of the embodiment of the present application in a manner of combining a mobile terminal and a server;
fig. 4 is a flowchart of a passenger flow analysis method based on speech recognition according to an embodiment of the present application (applied to a mobile terminal);
FIG. 5 is a flowchart of a method for analyzing passenger flow based on speech recognition (applied to a server) according to an embodiment of the present application;
fig. 6 is a schematic diagram of a passenger flow analysis system based on speech recognition according to an embodiment of the present application.
Detailed Description
The present application describes embodiments, but the description is illustrative rather than limiting and it will be apparent to those of ordinary skill in the art that many more embodiments and implementations are possible within the scope of the embodiments described herein. Although many possible combinations of features are shown in the drawings and discussed in the detailed description, many other combinations of the disclosed features are possible. Any feature or element of any embodiment may be used in combination with or instead of any other feature or element in any other embodiment, unless expressly limited otherwise.
The present application includes and contemplates combinations of features and elements known to those of ordinary skill in the art. The embodiments, features and elements disclosed in this application may also be combined with any conventional features or elements to form a unique inventive concept as defined by the claims. Any feature or element of any embodiment may also be combined with features or elements from other inventive aspects to form yet another unique inventive aspect, as defined by the claims. Thus, it should be understood that any of the features shown and/or discussed in this application may be implemented alone or in any suitable combination. Accordingly, the embodiments are not limited except as by the appended claims and their equivalents. Furthermore, various modifications and changes may be made within the scope of the appended claims.
Further, in describing representative embodiments, the specification may have presented the method and/or process as a particular sequence of steps. However, to the extent that the method or process does not rely on the particular order of steps set forth herein, the method or process should not be limited to the particular sequence of steps described. Other orders of steps are possible as will be understood by those of ordinary skill in the art. Therefore, the particular order of the steps set forth in the specification should not be construed as limitations on the claims. Further, the claims directed to the method and/or process should not be limited to the performance of their steps in the order written, and one skilled in the art can readily appreciate that the sequences may be varied and still remain within the spirit and scope of the embodiments of the present application.
The embodiment of the application provides a passenger flow analysis method and system based on voice recognition, which are used for monitoring the working condition of staff in real time, knowing the number of clients and identifying new and old clients, so that the passenger flow and the client condition are analyzed.
The embodiment of the application can be suitable for passenger flow analysis of off-line stores and can also be used for passenger flow analysis in the fields of telephone shopping, service consultation and the like.
As shown in fig. 1, a method for analyzing passenger flow based on speech recognition in an embodiment of the present application includes:
step 101, collecting dialogue voice information of the staff and the client for dialogue.
The dialogue voice information can be collected by having the employee wear a terminal with a voice-collection function while at work.
The staff can be shopping guide staff, telephone operators and other staff who communicate with the customers through voice.
And step 102, identifying employee voice information in the conversation voice information to determine client voice information containing client voice.
In an embodiment, the method further comprises:
and modeling the voiceprint of the employee in advance to obtain an individual voiceprint model of the employee.
The individual voiceprint model stores multiple pieces of voiceprint data, each corresponding to the voiceprint of a keyword. The keywords can be set to greetings the employee commonly uses at work, such as "good morning", "hello", and "may I ask". For example, the voiceprint features of the phrases a shopping guide typically uses to open a conversation with a customer, such as "hello", "good morning", and "may I ask", are stored in the individual voiceprint model in advance, so that they can be recognized quickly during identification.
During recognition processing, when a segment of voice information contains a voiceprint feature corresponding to the voiceprint model of a keyword, that segment is considered to come from an employee rather than from an unrelated source (for example, a recording device playing back speech).
In one embodiment, step 102 comprises:
and extracting the characteristics of the dialogue voice information, comparing and matching the obtained voiceprint characteristics with the individual voiceprint models of the employees, marking the voice information which is successfully matched in the dialogue voice information according to the matching result, and determining that the voice information which is not marked in the dialogue voice information is the client voice information.
And when the voiceprint characteristics in the dialogue voice information are matched with the voiceprint characteristics stored in the individual voiceprint model in advance, the identity confirmation information is obtained.
After the collected dialogue voice information is subjected to feature extraction, the voiceprint data in the individual voiceprint model is traversed for matching; in the matching process, a similarity threshold value can be set, matching calculation is carried out by adopting a similarity algorithm, and when the calculation result is within the range of the similarity threshold value, the voice of the corresponding staff can be judged. And if the matching is unsuccessful, discarding the dialogue voice information.
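The traverse-and-threshold matching described above can be sketched as follows. This is an illustrative sketch only, not the disclosed implementation: voiceprint features are assumed to be fixed-length vectors, cosine similarity stands in for the unspecified similarity algorithm, and the names (`match_employee`, `SIMILARITY_THRESHOLD`) and the 0.8 threshold are hypothetical.

```python
import math

SIMILARITY_THRESHOLD = 0.8  # assumed value; the text only says a threshold "can be set"

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def match_employee(segment_features, employee_model):
    """Traverse the employee's individual voiceprint model (a list of
    keyword voiceprint vectors) and report whether any entry is similar
    enough for the segment to be judged as employee speech."""
    return any(
        cosine_similarity(segment_features, keyword_vec) >= SIMILARITY_THRESHOLD
        for keyword_vec in employee_model
    )
```

In practice the feature vectors would come from an acoustic front end (e.g. speaker embeddings); the sketch only shows the traversal and thresholding logic.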
The successfully matched voice information in the dialogue voice information is marked.
If the voiceprint features in the dialogue voice information indicate more than one speaker (i.e., voices other than the employee's are present), it is determined that the dialogue voice information contains client voice information in addition to the employee voice information.
Alternatively, if some voiceprint features in the dialogue voice information match the voiceprint features stored in advance in the individual voiceprint model (the employee's voice) while others do not (the client's voice), the unmarked voice information in the dialogue voice information is determined to be the client voice information.
In the embodiment of the application, matching is performed on the voiceprint features, and voice information is marked when matching succeeds: if a voiceprint feature successfully matches the employee's individual voiceprint model, the voice information containing that feature is marked as employee voice information. After the whole of the dialogue voice information has been processed in this way, it is divided into two parts, one marked and one unmarked, and the subsequent client analysis is performed only on the unmarked part (i.e., the client voice information).
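The two-part division of the dialogue voice information can be sketched as below; `is_employee` stands in for the matching step described above, and all names are illustrative rather than part of the disclosure.

```python
def partition_dialogue(segments, is_employee):
    """Split voice-information segments into a marked (employee) part and
    an unmarked (client) part; only the unmarked part goes on to client
    analysis."""
    marked, unmarked = [], []
    for seg in segments:
        (marked if is_employee(seg) else unmarked).append(seg)
    return marked, unmarked
```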
And 103, determining the identity information of the client according to the voiceprint characteristics of the voice information of the client.
The customer identity information may include a customer type (new customer or old customer).
As shown in fig. 2, in an embodiment, the step 103 includes:
step 201, splitting the dialogue voice information to obtain dialogue segments.
In an embodiment, the dialogue voice information is analyzed, the dialogue interval duration in the dialogue voice information is compared with a preset dialogue blank threshold, the dialogue starting time and the dialogue ending time of a dialogue section are determined, and the dialogue section is obtained by splitting according to the dialogue starting time and the dialogue ending time.
A dialogue blank threshold is preset. When an interval between utterances in the dialogue voice information is less than or equal to the dialogue blank threshold, the speech belongs to the same dialogue segment; when the interval is greater than the dialogue blank threshold, the dialogue segment is judged to have ended. For example, in a dialogue voice message with a total duration of 2 min, there are several intervals without dialogue speech, but each is shorter than the dialogue blank threshold (which can be set to 15 s, for example), so the message is determined to be a single 2-min dialogue segment.
The speaking start time point and speaking end time point of each dialogue segment are recorded, so that the dialogue duration T_i of each segment can be calculated (where i is the index of the dialogue segment).
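The splitting rule and duration computation can be sketched as follows, under the assumption that speech activity has already been reduced to sorted (start, end) intervals in seconds; the 15 s threshold follows the example in the text, and the function name is hypothetical.

```python
BLANK_THRESHOLD = 15.0  # dialogue blank threshold in seconds, per the example

def split_dialog_segments(intervals, blank_threshold=BLANK_THRESHOLD):
    """intervals: sorted (start, end) speech intervals.
    Gaps <= blank_threshold stay within one dialogue segment;
    a larger gap ends the segment and starts a new one.
    Returns a list of (segment_start, segment_end) pairs."""
    segments = []
    for start, end in intervals:
        if segments and start - segments[-1][1] <= blank_threshold:
            segments[-1] = (segments[-1][0], end)   # same dialogue segment
        else:
            segments.append((start, end))           # a new dialogue begins
    return segments
```

The duration T_i of each segment is then simply `end - start` for each returned pair.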
Step 202, extracting the voiceprint characteristics of the client voice information in the dialog.
The dialogue segment includes both marked voice information (i.e., employee voice information) and unmarked voice information (i.e., client voice information); the employee voice information does not need further processing.
Step 203, determining the client type to which the client voice information belongs according to the voiceprint characteristics of the client voice information.
Wherein, a voiceprint database is constructed in advance and used for storing the voiceprint characteristics of the client.
In one embodiment, the step 203 comprises:
comparing and matching the voiceprint features of the client voice information with the client voiceprint features in a voiceprint database, if the matched client voiceprint features are not found in the voiceprint database, determining that the voiceprint features of the client voice information belong to a new client, and storing the voiceprint features of the new client into the voiceprint database; and if the matched client voiceprint characteristics are found in the voiceprint database, determining that the voiceprint characteristics of the client voice information belong to the old client.
When a client is judged to be a new client, the client's voiceprint features are stored in the voiceprint database, and a fixed retention time can be set. For example, the voiceprint database may be updated every 2 hours, with newly collected voiceprint features stored into it; in this way, if a customer visits the store in the morning and again in the afternoon, the afternoon visit can be identified as that of an old customer.
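The new/old-client decision against the voiceprint database can be sketched as below. The similarity function and the 0.8 threshold are assumptions mirroring the employee-matching step; the retention policy mentioned above is not modelled, and all names are illustrative.

```python
def classify_customer(features, voiceprint_db, similarity, threshold=0.8):
    """Compare a client's voiceprint features against the database.
    A sufficiently similar stored entry means an old client; otherwise
    the client is new and the features are enrolled in the database."""
    for stored in voiceprint_db:
        if similarity(features, stored) >= threshold:
            return "old"
    voiceprint_db.append(features)  # store the new client's voiceprint
    return "new"
```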
And 104, performing statistical analysis on the customer flow information according to the customer identity information.
In one embodiment, after identifying a new customer or an old customer, the number of customers is counted to obtain customer traffic information, and the customer traffic information can be statistically analyzed in time intervals.
In one embodiment, the storage management of the dialogue voice information comprises the classified storage of the dialogue duration, the ID information, the position information and the voiceprint characteristics of the dialogue voice information.
In step 104, the peak passenger-flow time periods can be analyzed from the time periods in which each employee converses with clients and the corresponding numbers of clients. Based on these data, managers can schedule more staff during peak periods, thereby allocating human resources reasonably.
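A minimal sketch of the per-period statistics: distinct clients are counted per hour and the peak hour is reported. The record shape (client identifier, hour of the conversation) is a hypothetical simplification of the stored dialogue metadata.

```python
from collections import defaultdict

def peak_hours(records):
    """records: iterable of (client_id, hour).
    Returns (distinct-client count per hour, hour with the most clients)."""
    by_hour = defaultdict(set)
    for client_id, hour in records:
        by_hour[hour].add(client_id)            # distinct clients per hour
    per_hour = {h: len(ids) for h, ids in by_hour.items()}
    peak = max(per_hour, key=per_hour.get)      # the peak passenger-flow hour
    return per_hour, peak
```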
The embodiment of the application can distinguish new and old clients through the voiceprint characteristics of the voice recognition clients, analyze the big data and monitor the passenger flow, thereby reasonably adjusting human resources.
As shown in fig. 3, the embodiment of the present application may be implemented in a manner of combining a mobile terminal and a server.
As for the mobile terminal, as shown in fig. 4, the passenger flow analysis method based on speech recognition in the embodiment of the present application includes:
step 401, collecting dialogue voice information of the staff and the client for dialogue.
The dialogue voice information can be collected by having the employee wear a terminal with a voice-collection function while at work.
Step 402, identifying the employee voice information in the dialogue voice information, determining the client voice information containing the client's voice, and sending the client voice information to the server, so that the server can determine client traffic information from the client voice information and perform passenger flow analysis.
In an embodiment, the method further comprises:
and modeling the voiceprint of the employee in advance to obtain an individual voiceprint model of the employee.
The individual voiceprint model stores multiple pieces of voiceprint data, each corresponding to the voiceprint of a keyword. The keywords can be set to greetings the employee commonly uses at work, such as "good morning", "hello", and "may I ask". For example, the voiceprint features of the phrases a shopping guide typically uses to open a conversation with a customer, such as "hello", "good morning", and "may I ask", are stored in the individual voiceprint model in advance, so that they can be recognized quickly during identification.
In one embodiment, step 402 includes:
and extracting the characteristics of the dialogue voice information, comparing and matching the obtained voiceprint characteristics with the individual voiceprint models of the employees, marking the voice information which is successfully matched in the dialogue voice information according to the matching result, and determining that the voice information which is not marked in the dialogue voice information is the client voice information.
And when the voiceprint characteristics in the dialogue voice information are matched with the voiceprint characteristics stored in the individual voiceprint model in advance, the identity confirmation information is obtained.
After the collected dialogue voice information is subjected to feature extraction, the voiceprint data in the individual voiceprint model is traversed for matching; in the matching process, a similarity threshold value can be set, matching calculation is carried out by adopting a similarity algorithm, and when the calculation result is within the range of the similarity threshold value, the voice of the corresponding staff can be judged. And if the matching is unsuccessful, discarding the dialogue voice information.
And marking the voice information which is successfully matched in the dialogue voice information.
And judging that the number of speakers is more than 1 (other speaker voices except for the staff in the dialogue voice information) according to the voiceprint characteristics in the dialogue voice information, and determining that the dialogue voice information contains the voice information of the client besides the staff voice information.
And when the speaker is judged to be more than 1 person, the mobile terminal transmits the conversation voice information, the ID (identification) information and the position information to the server side.
The embodiment uses a plurality of mobile terminals that can be worn on the body, each bound in advance to a shopping guide via an ID. The data sent to the server side therefore includes the dialogue voice information, the personal ID information, and the location information. During voiceprint matching, the personal ID information is used to find the individual voiceprint model corresponding to the terminal's wearer, and the voiceprint features extracted from the voice information are then compared one by one against the voiceprint data in that model; this avoids traversing all models and speeds up the voice recognition processing. The location information can be used to locate the specific store during statistical analysis of the data.
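The ID-based lookup that avoids traversing all individual voiceprint models can be sketched as follows; the payload field names (`voice`, `person_id`, `location`) are assumptions based on the three data items the text says are transmitted.

```python
def handle_payload(payload, models_by_id):
    """payload: dict with 'voice', 'person_id', and 'location' keys.
    The personal ID selects the wearer's individual voiceprint model
    directly, so only that one model needs to be searched."""
    model = models_by_id.get(payload["person_id"])
    if model is None:
        raise KeyError("no voiceprint model enrolled for this ID")
    return model
```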
For the server, as shown in fig. 5, the method for analyzing passenger flow based on speech recognition in the embodiment of the present application includes:
step 501, obtaining dialogue voice information containing client voice information, and determining client identity information according to voiceprint characteristics of the client voice information.
The server receives conversation voice information containing client voice information sent by the mobile terminal.
As shown in fig. 2, in an embodiment, the step 501 includes:
step 201, splitting the dialogue voice information to obtain dialogue segments.
In an embodiment, the dialogue voice information is analyzed, the dialogue interval duration in the dialogue voice information is compared with a preset dialogue blank threshold, the dialogue starting time and the dialogue ending time of a dialogue section are determined, and the dialogue section is obtained by splitting according to the dialogue starting time and the dialogue ending time.
A dialogue blank threshold is preset. When an interval between utterances in the dialogue voice information is less than or equal to the dialogue blank threshold, the speech belongs to the same dialogue segment; when the interval is greater than the dialogue blank threshold, the dialogue segment is judged to have ended. For example, in a piece of voice information with a total duration of 2 min, there are several intervals without dialogue speech, but each is shorter than the dialogue blank threshold (which can be set to 15 s, for example), so the voice information is determined to be a single 2-min dialogue segment.
The speaking start time and speaking end time of each dialogue segment are recorded, from which the dialogue duration T_i of each segment is calculated (where i is the index of the dialogue segment).
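The segment-splitting rule described above can be sketched in Python as follows; representing speech activity as (start, end) timestamps in seconds and the helper name `split_dialog_segments` are illustrative assumptions, not part of the patent:

```python
def split_dialog_segments(utterances, blank_threshold=15.0):
    """Group timed utterances, given as (start, end) pairs in seconds,
    into dialogue segments: a silent gap no longer than blank_threshold
    keeps the conversation going, a longer gap ends the segment."""
    if not utterances:
        return []
    segments = []
    seg_start, seg_end = utterances[0]
    for start, end in utterances[1:]:
        if start - seg_end <= blank_threshold:
            seg_end = end                          # same conversation continues
        else:
            segments.append((seg_start, seg_end))  # gap too long: close segment
            seg_start, seg_end = start, end
    segments.append((seg_start, seg_end))
    return segments

# 2 min of speech whose pauses (10 s each) stay under the 15 s threshold
print(split_dialog_segments([(0, 30), (40, 70), (80, 120)]))  # -> [(0, 120)]
```

The duration T_i of segment i is then simply its end time minus its start time.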
Step 202, extracting the voiceprint features of the client voice information in the dialogue segment.
The dialogue segment includes both marked voice information (i.e., employee voice information) and unmarked voice information (i.e., client voice information); the employee voice information does not need to be processed further.
Step 203, determining the client type to which the client voice information belongs according to the voiceprint characteristics of the client voice information.
Wherein, a voiceprint database is constructed in advance and used for storing the voiceprint characteristics of the client.
The client types may include new clients and old clients (i.e., returning clients). In one embodiment, the step 203 comprises:
comparing and matching the voiceprint features of the client voice information with the client voiceprint features in a voiceprint database, if the matched client voiceprint features are not found in the voiceprint database, determining that the voiceprint features of the client voice information belong to a new client, and storing the voiceprint features of the new client into the voiceprint database; and if the matched client voiceprint characteristics are found in the voiceprint database, determining that the voiceprint characteristics of the client voice information belong to the old client.
When a client is judged to be a new client, the voiceprint features are stored in the voiceprint database, and a fixed retention time can be set for them. For example, the voiceprint database may be updated every 2 hours, with newly collected voiceprint features stored into it; in this way, if a client comes to the store once in the morning and again in the afternoon, the client can be determined to be an old client.
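The new/old-client decision of step 203 might look like the following minimal sketch. The cosine-similarity comparison, the 0.8 matching threshold, and the in-memory `VoiceprintDB` class are all assumptions for illustration; the patent does not specify how voiceprint features are represented, compared, or persisted:

```python
import math

def cosine(a, b):
    """Cosine similarity between two voiceprint feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

class VoiceprintDB:
    """Toy in-memory voiceprint store; a real system would use a
    speaker-embedding model and a persistent database."""

    def __init__(self, threshold=0.8):
        self.prints = []            # stored client voiceprint features
        self.threshold = threshold  # minimum similarity for a match

    def classify(self, features):
        """Return 'old' if a stored voiceprint matches; otherwise store
        the features as a new client and return 'new'."""
        for stored in self.prints:
            if cosine(stored, features) >= self.threshold:
                return "old"
        self.prints.append(features)
        return "new"

db = VoiceprintDB()
print(db.classify([1.0, 0.0, 0.2]))   # first visit -> 'new'
print(db.classify([0.9, 0.1, 0.2]))   # similar voice later -> 'old'
```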
Step 502, performing statistical analysis on the customer flow information according to the customer identity information.
In one embodiment, after new and old clients are identified, the number of clients is counted to obtain client traffic information, which can then be statistically analyzed by time interval.
In one embodiment, the storage management of the dialogue voice information comprises the classified storage of the dialogue duration, the ID information, the position information and the voiceprint characteristics of the dialogue voice information.
In step 502, the peak passenger flow periods can be determined from the time periods of the conversations between each employee and clients, together with the corresponding client counts. Based on this data, managers can add staff during the peak periods, so that human resources are allocated reasonably.
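The peak-period analysis can be sketched as below; bucketing conversations by hour of day and the helper name `peak_hours` are assumptions made for illustration:

```python
from collections import Counter

def peak_hours(visits, top_n=1):
    """visits: (employee_id, client_id, hour_of_day) tuples, one per
    dialogue segment.  Returns the busiest hour(s), counting each
    client at most once per hour even when several employees serve
    the same client."""
    seen = set()
    per_hour = Counter()
    for _employee, client, hour in visits:
        if (client, hour) not in seen:
            seen.add((client, hour))
            per_hour[hour] += 1
    return [hour for hour, _count in per_hour.most_common(top_n)]

visits = [
    ("E1", "C1", 10), ("E2", "C2", 10), ("E1", "C3", 10),  # 3 clients at 10:00
    ("E1", "C4", 14), ("E2", "C4", 14),                    # C4 served twice at 14:00
]
print(peak_hours(visits))   # -> [10]
```

Managers could then add staff in the returned hours; an analogous grouping by employee ID would give per-employee workload.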
As shown in fig. 6, an embodiment of the present application provides a system for analyzing passenger flow based on speech recognition, including a server 62 and a plurality of mobile terminals 61, wherein:
the mobile terminal 61 is configured to collect conversation voice information of a conversation between an employee and a client, determine client voice information including client voice by recognizing employee voiceprint characteristics of the conversation voice information, and send the client voice information to the server 62;
the server 62 is configured to determine client identity information according to the client voiceprint feature of the client voice information, and perform statistical analysis on client traffic information according to the client identity information.
The mobile terminal 61 is provided with:
(1) voice acquisition module 611
The voice acquisition module is used for automatically acquiring dialogue voice information.
(2) Speech recognition module 612
The voice recognition module comprises: an individual voiceprint model and an identity recognition unit.
The voiceprints of all employees are modeled in advance to obtain the individual voiceprint models. An individual voiceprint model contains a plurality of voiceprint data entries, each corresponding to the voiceprint of one keyword; the keywords can be set to words commonly used at work, such as "good morning", "you", and "may I ask".
The identity recognition unit is configured to extract features from the dialogue voice information, compare and match them with the individual voiceprint model, and confirm the speaker's identity to obtain identity confirmation information.
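The keyword-based comparison performed by the identity recognition unit can be sketched as below; the cosine-similarity scoring, the example feature vectors, and the 0.8 acceptance threshold are illustrative assumptions, as the patent does not state a concrete matching algorithm:

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def identify_employee(models, keyword, features, threshold=0.8):
    """models: {employee_id: {keyword: reference_features}}.
    Scores the features extracted for one spoken keyword against each
    employee's stored voiceprint for that keyword and returns the best
    match, or None when no score reaches the threshold."""
    best_id, best_score = None, threshold
    for employee_id, prints in models.items():
        reference = prints.get(keyword)
        if reference is not None:
            score = cosine(reference, features)
            if score > best_score:
                best_id, best_score = employee_id, score
    return best_id

models = {"E1": {"good morning": [0.9, 0.1, 0.3]},
          "E2": {"good morning": [0.1, 0.9, 0.2]}}
print(identify_employee(models, "good morning", [0.88, 0.12, 0.28]))  # -> E1
```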
(3) Feature marking module 613
And marking the voice information of the identified employee.
(4) Number of people judging module 614
For judging, according to the voiceprint characteristics in the dialogue voice information, whether the number of speakers is greater than 1 (that is, whether voices other than the employee's are present in the voice information).
(5) Positioning module 615
For recording the location of the dialogue voice, so that a specific store can be pinpointed during statistical data analysis. A GPS (Global Positioning System) module may be employed.
(6) ID module 616
For recording the identity of the employee in the dialogue voice, so that a specific employee can be pinpointed during statistical data analysis.
(7) Data transmission module 617
The data transmission module 617 transmits the dialogue voice information to the server in real time when the identity recognition unit has confirmed the identity and the number-of-speakers judging module has judged that more than one speaker is present. In this embodiment, if the voiceprint features extracted from the collected voice information match voiceprint features already stored in the mobile terminal, and more than one speaker is present, the data is automatically transmitted to the server in real time.
The mobile terminals of this embodiment can be worn on the employees' bodies, and each mobile terminal is bound in advance to an employee through an ID. Therefore, the data transmitted by the data transmission module 617 to the server side includes: dialogue voice information, personal ID information, and location information.
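The transmission trigger and payload might be sketched as follows; the JSON encoding and the field names are assumptions, since the patent only specifies that dialogue voice information, personal ID information, and location information are sent:

```python
import json
import time

def build_upload_packet(identity_confirmed, speaker_count,
                        audio_ref, employee_id, location):
    """Return the JSON payload to upload, or None when the trigger
    conditions (confirmed wearer identity AND more than one speaker,
    i.e. a client is present) are not met."""
    if not (identity_confirmed and speaker_count > 1):
        return None
    return json.dumps({
        "audio": audio_ref,          # the recorded dialogue voice information
        "employee_id": employee_id,  # personal ID bound to the terminal
        "location": location,        # e.g. a GPS fix identifying the store
        "timestamp": time.time(),
    })

print(build_upload_packet(True, 2, "rec-001", "E7", "store-3") is not None)  # True
print(build_upload_packet(True, 1, "rec-001", "E7", "store-3"))              # None
```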
The server 62 is provided with:
(1) voice processing module 621
Processes the dialogue voice information uploaded by the mobile terminal: splits it into dialogue segments, calculates the duration of each dialogue segment, and extracts voiceprint features from the segments.
(2) Feature determination module 622
Processes the unmarked voice information in the dialogue voice information: identifies the unmarked voice information and compares and matches its voiceprint features with those in the voiceprint database, so as to distinguish new clients from old clients.
(3) Counting module 623
Counts the unmarked dialogue voice information identified by the feature determination module 622 to obtain the number of clients entering the store.
(4) Storage management module 624
Manages the data uploaded by the mobile terminal, and constructs a database according to the location information and the ID information of the mobile terminal, the time information of the dialogue voice message, and the voiceprint feature information recognized by the feature determination module 622.
According to the database, this scheme can identify new and old clients and reveal store conditions such as passenger flow volume and peak passenger flow periods, helping managers allocate manpower reasonably.
It should be noted that in the embodiment of the present application, passenger flow analysis is implemented through a division of work between the mobile terminal and the server: the mobile terminal performs voice information collection and voice recognition verification, while the server is responsible for client traffic statistics and statistical analysis. In other embodiments the division may differ; for example, the mobile terminal may be responsible only for voice information collection, with the server performing voice recognition verification as well as client traffic statistics and analysis. In that case the server can quickly locate the individual voiceprint model corresponding to an employee through the employee's personal ID information, enabling fast matching of voiceprint features.
An embodiment of the present application further provides a mobile terminal, including: a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method for speech recognition based passenger flow analysis when executing the program.
An embodiment of the present application further provides a server, including: a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method for speech recognition based passenger flow analysis when executing the program.
The embodiment of the application also provides a computer-readable storage medium, which stores computer-executable instructions, and the computer-executable instructions are used for executing the passenger flow analysis method based on the voice recognition.
In this embodiment, the storage medium may include, but is not limited to: a U-disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program codes.
It will be understood by those of ordinary skill in the art that all or some of the steps of the methods, systems, functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all of the components may be implemented as software executed by a processor, such as a digital signal processor or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as is well known to those of ordinary skill in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. In addition, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media as known to those skilled in the art.

Claims (10)

1. A passenger flow analysis method based on voice recognition is characterized by comprising the following steps:
collecting conversation voice information of a conversation between an employee and a client;
determining client voice information containing client voice by identifying employee voice information in the conversation voice information;
determining client identity information according to the voiceprint characteristics of the client voice information;
and carrying out statistical analysis on the customer flow information according to the customer identity information.
2. The method of speech recognition based passenger flow analysis according to claim 1, further comprising:
and modeling the voiceprint of the employee in advance to obtain an individual voiceprint model of the employee.
3. The method for analyzing passenger flow based on voice recognition according to claim 1, wherein the determining the client voice information containing client voice by recognizing the employee voice information in the dialogue voice information comprises:
and extracting the characteristics of the dialogue voice information, comparing and matching the obtained voiceprint characteristics with the individual voiceprint models of the employees, marking the voice information which is successfully matched in the dialogue voice information according to the matching result, and determining that the voice information which is not marked in the dialogue voice information is the client voice information.
4. The method for analyzing passenger flow based on voice recognition according to claim 1, wherein the determining the client identity information according to the voiceprint feature of the client voice information comprises:
splitting the dialogue voice information to obtain dialogue sections;
extracting the voiceprint characteristics of the client voice information in the dialog section;
and determining the client type to which the client voice information belongs according to the voiceprint characteristics of the client voice information.
5. The method of claim 4, wherein the splitting the dialogue voice information to obtain dialogue segments comprises:
and analyzing the dialogue voice information, comparing the dialogue interval duration in the dialogue voice information with a preset dialogue blank threshold value, determining the dialogue starting time and the dialogue ending time of the dialogue section, and splitting according to the dialogue starting time and the dialogue ending time to obtain the dialogue section.
6. The method of claim 4, wherein the determining the type of the client to which the client voice information belongs according to the voiceprint characteristics of the client voice information comprises: comparing and matching the voiceprint features of the client voice information with the client voiceprint features in a voiceprint database, if the matched client voiceprint features are not found in the voiceprint database, determining that the voiceprint features of the client voice information belong to a new client, and storing the voiceprint features of the new client into the voiceprint database; and if the matched client voiceprint characteristics are found in the voiceprint database, determining that the voiceprint characteristics of the client voice information belong to the old client.
7. A passenger flow analysis method based on voice recognition is characterized by comprising the following steps:
obtaining dialogue voice information containing client voice information, and determining client identity information according to voiceprint characteristics of the client voice information;
and carrying out statistical analysis on the client flow information according to the client identity information.
8. A passenger flow analysis system based on voice recognition is characterized by comprising a server and a plurality of mobile terminals, wherein:
the mobile terminal is used for collecting conversation voice information of a conversation between an employee and a client, determining client voice information containing client voice by identifying employee voiceprint characteristics of the conversation voice information, and sending the client voice information to the server;
and the server is used for determining client identity information according to the client voiceprint characteristics of the client voice information and carrying out statistical analysis on client flow information according to the client identity information.
9. A server, comprising: memory, processor and computer program stored on the memory and executable on the processor, characterized in that the processor implements the method as claimed in claim 7 when executing the program.
10. A computer-readable storage medium storing computer-executable instructions for performing the method of any one of claims 1-7.
CN201911018711.0A 2019-10-24 2019-10-24 Passenger flow analysis method and system based on voice recognition Pending CN110827829A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911018711.0A CN110827829A (en) 2019-10-24 2019-10-24 Passenger flow analysis method and system based on voice recognition


Publications (1)

Publication Number Publication Date
CN110827829A true CN110827829A (en) 2020-02-21

Family

ID=69550473

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911018711.0A Pending CN110827829A (en) 2019-10-24 2019-10-24 Passenger flow analysis method and system based on voice recognition

Country Status (1)

Country Link
CN (1) CN110827829A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111626813A (en) * 2020-04-22 2020-09-04 北京健康之家科技有限公司 Product recommendation method and system
CN111933151A (en) * 2020-08-16 2020-11-13 云知声智能科技股份有限公司 Method, device and equipment for processing call data and storage medium
CN112734467A (en) * 2020-12-31 2021-04-30 北京明略软件系统有限公司 Passenger flow prediction method and system for offline service scene
CN113240347A (en) * 2021-06-17 2021-08-10 恩亿科(北京)数据科技有限公司 Service behavior data analysis method, system, storage medium and electronic device

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105244031A (en) * 2015-10-26 2016-01-13 北京锐安科技有限公司 Speaker identification method and device
CN105895101A (en) * 2016-06-08 2016-08-24 国网上海市电力公司 Speech processing equipment and processing method for power intelligent auxiliary service system
CN106683678A (en) * 2016-11-30 2017-05-17 厦门快商通科技股份有限公司 Artificial telephone customer service auxiliary system and method
CN106782496A (en) * 2016-11-15 2017-05-31 北京科技大学 A kind of crowd's Monitoring of Quantity method based on voice and intelligent perception
CN107341685A (en) * 2017-05-24 2017-11-10 百度在线网络技术(北京)有限公司 Data analysing method and device
CN107766817A (en) * 2017-10-17 2018-03-06 广东码识图信息科技有限公司 Passenger flow analysing methods, devices and systems based on living things feature recognition
CN108335695A (en) * 2017-06-27 2018-07-27 腾讯科技(深圳)有限公司 Sound control method, device, computer equipment and storage medium
CN108766439A (en) * 2018-04-27 2018-11-06 广州国音科技有限公司 A kind of monitoring method and device based on Application on Voiceprint Recognition
CN108805111A (en) * 2018-09-07 2018-11-13 杭州善贾科技有限公司 A kind of detection of passenger flow system and its detection method based on recognition of face
CN109657186A (en) * 2018-12-27 2019-04-19 广州势必可赢网络科技有限公司 A kind of demographic method, system and relevant apparatus
CN109697556A (en) * 2018-12-12 2019-04-30 深圳市沃特沃德股份有限公司 Evaluate method, system and the intelligent terminal of effect of meeting




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20210816

Address after: 200232 unit 5b06, floor 5, building 2, No. 277, Longlan Road, Xuhui District, Shanghai

Applicant after: Shanghai Mingsheng Pinzhi Artificial Intelligence Technology Co.,Ltd.

Address before: 100102 room 321008, 5 building, 1 Tung Fu Street East, Chaoyang District, Beijing.

Applicant before: MIAOZHEN INFORMATION TECHNOLOGY Co.,Ltd.

RJ01 Rejection of invention patent application after publication

Application publication date: 20200221