CN114912463B - Conference automatic recording method, system, readable storage medium and computer equipment - Google Patents


Info

Publication number
CN114912463B
CN114912463B (application CN202210818164.XA)
Authority
CN
China
Prior art keywords
conference
information
conference record
user
voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210818164.XA
Other languages
Chinese (zh)
Other versions
CN114912463A (en)
Inventor
邱晓健
邱正峰
连峰
崔韧
杨光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanchang Hang Tian Guang Xin Technology Co ltd
Original Assignee
Nanchang Hang Tian Guang Xin Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanchang Hang Tian Guang Xin Technology Co ltd filed Critical Nanchang Hang Tian Guang Xin Technology Co ltd
Priority to CN202210818164.XA priority Critical patent/CN114912463B/en
Publication of CN114912463A publication Critical patent/CN114912463A/en
Application granted granted Critical
Publication of CN114912463B publication Critical patent/CN114912463B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/12Use of codes for handling textual entities
    • G06F40/151Transformation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/166Editing, e.g. inserting or deleting
    • G06F40/186Templates
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/258Heading extraction; Automatic titling; Numbering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/10Office automation; Time management
    • G06Q10/103Workflow collaboration or project management
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/04Segmentation; Word boundary detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • Data Mining & Analysis (AREA)
  • Human Computer Interaction (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Telephonic Communication Services (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention provides a conference automatic recording method, system, readable storage medium and computer equipment. The method comprises the following steps: acquiring key information at each time point on the timeline of a conference, and matching each piece of key information to its terminal device through a terminal configuration file; performing text conversion on each piece of key information and the device information of the corresponding terminal device to obtain text information; performing semantic recognition on each piece of text information in sequence to generate the subject information of the conference and a conference record source file; configuring the subject information and the timeline identifier of the conference as the identification title of the conference record source file, and storing the configured source file in a conference record database. By matching the key information of the conference to terminal devices through the terminal configuration file, converting the key information into text information that carries a timeline identifier, and converting the text information into subject information and a conference record source file, the invention solves the problems caused by manual recording.

Description

Conference automatic recording method, system, readable storage medium and computer equipment
Technical Field
The invention relates to the technical field of data processing, in particular to an automatic conference recording method, an automatic conference recording system, a readable storage medium and computer equipment.
Background
In general, companies hold important conferences for project groups in order to guarantee the operation of their projects, and with the rapid development of science and technology, conference formats are becoming more and more diversified.
During a conference, a recorder is usually assigned to record the conference information of each participant. However, as the number of participants grows, manual recording often mismatches records with people, misses records, or records information inaccurately, and paper conference records that are lost or improperly stored cannot be consulted later.
Disclosure of Invention
Based on this, the object of the present invention is to provide an automatic conference recording method, system, readable storage medium and computer device, so as to solve at least the deficiencies in the above-mentioned technologies.
The invention provides an automatic conference recording method, which comprises the following steps:
acquiring key information of each time point on a time line of a conference, and performing terminal matching on each piece of key information through a preset terminal configuration file so as to enable each piece of key information to correspond to each piece of terminal equipment in the conference;
performing text conversion on each piece of key information and equipment information of corresponding terminal equipment to obtain corresponding text information, wherein the text information has a timeline identification of the conference;
performing semantic recognition on each piece of text information in sequence, and generating subject information and a conference record source file of the conference according to a preset keyword extraction rule and a text arrangement rule;
and configuring the theme information and the timeline identification of the conference as a recognition title of the conference record source file, and storing the configured conference record source file in a conference record database.
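As an illustrative sketch only (the data structures and field names below are assumptions, not part of the claims), the four steps above can be outlined as:

```python
from dataclasses import dataclass

@dataclass
class KeyInfo:
    timestamp: str   # timeline identifier of the conference
    device_id: str   # terminal device matched via the configuration file
    text: str        # key information after text conversion

def build_record(key_infos, topic):
    """Arrange text entries along the timeline and prepend an identification title."""
    body = sorted(key_infos, key=lambda k: k.timestamp)
    title = f"{topic} [{body[0].timestamp}]"
    lines = [f"{k.timestamp} {k.device_id}: {k.text}" for k in body]
    return title, "\n".join(lines)
```

The sort by timestamp stands in for the text arrangement rule; a real system would use the full timeline identifier rather than a bare timestamp string.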
Further, after the step of storing the configured meeting record source file in the meeting record database, the method further comprises:
when a viewing request of a user for a conference record is acquired, a template selection signal of the user and an identification code of the conference record in the viewing request are analyzed;
reading the identification code of the conference record, and judging whether the identification code of the conference record meets the preset requirement;
and if the identification code of the conference record meets the preset requirement, selecting a corresponding template from a preset template library according to the template selection signal, and calling a conference record source file corresponding to the conference record identification code from the conference record database to be led into the template so as to enable the user to check the conference record source file.
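A minimal sketch of the viewing flow above, assuming a hypothetical identification-code format and dictionary-backed template library and record database (none of which are specified by the source):

```python
import re

RECORD_ID_PATTERN = re.compile(r"^MR-\d{4}$")  # hypothetical identification-code format

def handle_view_request(request, template_library, record_db):
    """Validate the record's identification code, then import the source file into the chosen template."""
    record_id = request["record_id"]
    if not RECORD_ID_PATTERN.match(record_id):
        return None  # identification code fails the preset requirement
    source = record_db.get(record_id)
    if source is None:
        return None
    template = template_library[request["template"]]
    return template.format(content=source)
```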
Further, before the step of acquiring the key information at each time point on the timeline of the conference, the method further includes:
acquiring user information and control codes sent by a plurality of equipment terminals in a conference, wherein the user information at least comprises user identifications;
numbering each terminal device according to each user identifier and each control code, and constructing a terminal configuration file according to the numbered device terminals.
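The numbering step might look like the following sketch, where the `S`-prefixed numbering scheme is an assumption borrowed from the embodiment's example device number:

```python
def build_terminal_profile(registrations):
    """Number each terminal from its (user_id, control_code) pair and build the configuration file."""
    profile = {}
    for number, (user_id, control_code) in enumerate(sorted(registrations), start=1):
        profile[f"S{number:03d}"] = {"user": user_id, "control_code": control_code}
    return profile
```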
Further, the step of acquiring key information at each time point on the timeline of the conference includes:
acquiring voice data input by a user terminal in the conference in real time at each time point on a time line of the conference;
performing voice decomposition on the voice data through a preset voice database to obtain a plurality of voice fragments of the voice data;
and preprocessing each voice fragment, and constructing key information according to the preprocessed voice fragments.
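Working on an already-transcribed string for simplicity (the patent decomposes raw voice data), the sentence-breaking against a prestored vocabulary could be sketched as:

```python
def split_speech(transcript, break_words):
    """Break a transcript into voice segments at the prestored sentence-break vocabulary."""
    segments, current = [], []
    for token in transcript.split():
        current.append(token)
        if token in break_words:
            segments.append(" ".join(current))
            current = []
    if current:
        segments.append(" ".join(current))
    return segments
```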
Further, the step of preprocessing each of the speech segments and constructing key information according to the preprocessed speech segments includes:
sequentially carrying out voice decoding on each voice segment to obtain syllable data of each voice segment;
carrying out data correction on the syllable data of each voice segment by using a preset wrong word library to generate correct syllable data of each voice segment;
and recombining the correct syllable data of each voice fragment into a correct voice fragment, and generating key information through the correct voice fragment.
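A toy version of the wrong-word-library correction, with an assumed library of one entry (the "si chang" / "shi chang" confusion from the embodiment):

```python
# Hypothetical wrong-word library: maps mis-recognized syllable data to its correction.
WRONG_WORD_LIBRARY = {("s", "i"): ("sh", "i")}

def correct_syllables(syllables):
    """Replace each (initial, final) syllable found in the wrong-word library with its correction."""
    return [WRONG_WORD_LIBRARY.get(s, s) for s in syllables]
```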
Further, the step of acquiring key information at each time point on the timeline of the conference further includes:
acquiring action data input by a user terminal in the conference in real time at each time point on a time line of the conference;
performing action decomposition on the action data through a preset action database to obtain a plurality of action frames of the action data;
performing feature detection on each action frame to obtain feature point data of each action frame;
and determining character information corresponding to the feature point data of each action frame according to the corresponding relation between the standard feature point data and the character information, and converting each character information into corresponding key information.
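The final mapping step, reduced to a lookup against an assumed correspondence table (real feature point data would come from the detection step, not be hand-written tuples):

```python
# Assumed correspondence between standard feature point data and character information.
STANDARD_FEATURES = {(1, 0): "agree", (0, 1): "raise question"}

def frames_to_key_info(feature_points):
    """Map each frame's feature point data to character information, skipping unknown frames."""
    return [STANDARD_FEATURES[f] for f in feature_points if f in STANDARD_FEATURES]
```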
The invention also provides an automatic conference recording system, which comprises:
the matching module is used for acquiring key information of each time point on a time line of a conference and performing terminal matching on each piece of key information through a preset terminal configuration file so as to enable each piece of key information to correspond to each piece of terminal equipment in the conference;
the conversion module is used for performing text conversion on each piece of key information and the equipment information of the corresponding terminal equipment to obtain corresponding text information, and the text information has the timeline identification of the conference;
the identification module is used for performing semantic recognition on each piece of text information in sequence and generating subject information and a conference record source file of the conference according to a preset keyword extraction rule and a text arrangement rule;
and the configuration module is used for configuring the theme information and the time line identifier of the conference into the identification title of the conference record source file and storing the configured conference record source file in a conference record database.
Further, the system further comprises:
the analysis module is used for analyzing a template selection signal of a user and an identification code of the conference record in a viewing request when the viewing request of the user for the conference record is obtained;
the judging module is used for reading the identification code of the conference record and judging whether the identification code of the conference record meets the preset requirement;
and the calling module is used for selecting a corresponding template from a preset template library according to the template selection signal if the identification code of the conference record meets a preset requirement, and calling a conference record source file corresponding to the conference record identification code from the conference record database to be led into the template so as to enable the user to check the conference record source file.
Further, the system further comprises:
the system comprises an acquisition module, a processing module and a control module, wherein the acquisition module is used for acquiring user information and control codes sent by a plurality of equipment terminals in a conference, and the user information at least comprises user identifications;
and the processing module is used for numbering each terminal device according to each user identifier and each control code and constructing a terminal configuration file according to the numbered device terminal.
Further, the matching module comprises:
the first acquisition unit is used for acquiring voice data input by a user terminal in the conference in real time at each time point on a time line of the conference;
the first decomposition unit is used for carrying out voice decomposition on the voice data through a preset voice database to obtain a plurality of voice fragments of the voice data;
and the first processing unit is used for preprocessing each voice fragment and constructing key information according to a plurality of preprocessed voice fragments.
Further, the first processing unit is specifically configured to:
sequentially carrying out voice decoding on each voice segment to obtain syllable data of each voice segment;
carrying out data correction on the syllable data of each voice segment by using a preset wrong word library to generate correct syllable data of each voice segment;
and recombining the correct syllable data of each voice segment into a correct voice segment, and generating key information through the correct voice segment.
Further, the matching module further comprises:
the second acquisition unit is used for acquiring action data input by the user terminal in the conference in real time at each time point on a time line of the conference;
the second decomposition unit is used for performing action decomposition on the action data through a preset action database to obtain a plurality of action frames of the action data;
the characteristic detection unit is used for carrying out characteristic detection on each action frame to obtain characteristic point data of each action frame;
and the second processing unit is used for determining the character information corresponding to the feature point data of each action frame according to the corresponding relation between the standard feature point data and the character information and converting each character information into corresponding key information.
The invention also proposes a readable storage medium on which a computer program is stored which, when executed by a processor, implements the automatic recording method of a conference described above.
The invention also provides a computer device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor executes the computer program to realize the automatic conference recording method.
According to the conference automatic recording method, system, readable storage medium and computer equipment, the key information at each time point on the conference timeline is matched to the terminal devices in the conference through the terminal configuration file, so that each piece of key information corresponds to its terminal device and conference records are no longer mismatched with people. The key information is converted into text information carrying a timeline identifier, which avoids temporal misalignment of the conference record. Semantic recognition, together with the keyword extraction rule and the text arrangement rule, converts the text information into subject information and a conference record source file, solving the problems caused by manual recording. Storing the source file in a conference record database makes it convenient for participants to consult the conference later.
Drawings
Fig. 1 is a flowchart of an automatic conference recording method according to a first embodiment of the present invention;
FIG. 2 is a detailed flowchart of step S101 in FIG. 1;
FIG. 3 is a detailed flowchart of step S113 in FIG. 2;
FIG. 4 is another detailed flowchart of step S101 in FIG. 1;
fig. 5 is a flowchart of an automatic conference recording method according to a second embodiment of the present invention;
fig. 6 is a block diagram showing the structure of an automatic conference recording system according to a third embodiment of the present invention;
fig. 7 is a block diagram showing a configuration of a computer device according to a fourth embodiment of the present invention.
Description of the main element symbols:
Memory 10; Processor 20; Computer program 30; Matching module 11; Conversion module 12; Identification module 13; Configuration module 14.
The following detailed description will further illustrate the invention in conjunction with the above-described figures.
Detailed Description
To facilitate an understanding of the invention, the invention will now be described more fully with reference to the accompanying drawings. Several embodiments of the invention are shown in the drawings. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete.
It will be understood that when an element is referred to as being "secured to" another element, it can be directly on the other element or intervening elements may also be present. When an element is referred to as being "connected" to another element, it can be directly connected to the other element or intervening elements may also be present. The terms "vertical," "horizontal," "left," "right," and the like as used herein are for illustrative purposes only.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
Example one
Referring to fig. 1, an automatic conference recording method according to a first embodiment of the present invention is shown, where the method specifically includes steps S101 to S104:
s101, acquiring key information of each time point on a time line of a conference, and performing terminal matching on each key information through a preset terminal configuration file so as to enable each key information to correspond to each terminal device in the conference;
further, referring to fig. 2, step S101 specifically includes steps S111 to S113:
s111, acquiring voice data input by a user terminal in the conference in real time at each time point on a time line of the conference;
s112, carrying out voice decomposition on the voice data through a preset voice database to obtain a plurality of voice fragments of the voice data;
s113, preprocessing each voice fragment, and constructing key information according to the preprocessed voice fragments.
In a specific implementation, at a plurality of time points on a conference timeline, a conference system may acquire, in real time, voice data input by a user terminal in a conference, where the conference timeline is associated with a conference time, the user terminal includes, but is not limited to, a microphone, and in other embodiments, the user terminal may also be a terminal device capable of transmitting audio and video signals, such as a mobile phone, a tablet computer, a notebook computer, and the like; the data input through the user terminal may also be motion data, picture data, etc.
Specifically, the conference system decomposes the voice data through a preset voice database to obtain a plurality of voice segments of the voice data. The voice database prestores a variety of sentence-breaking vocabularies, which record common speaking habits and the vocabulary that appears at normal semantic sentence boundaries, and the voice data is broken into segments against these vocabularies. For example, if the voice data input by the user through the terminal device is "I think that, for this project, we first need to do market research, then send people to the local area based on the research results, and decide whether to expand the market according to the local business conditions", the conference system breaks it into segments such as "I think that, for this project", "we first need to do market research", "then send people to the local area based on the research results", and "decide whether to expand the market according to the local business conditions".
It should be noted that, in this embodiment, the step of decomposing the speech data includes, but is not limited to, performing sentence-breaking on the speech data, and in some optional embodiments, operations such as noise reduction and semantic recognition may also be performed on the speech data.
Furthermore, the plurality of voice segments are integrated to obtain the corresponding key information: overlapping segments are merged, unnecessary filler words (such as the repeated "I think") are removed, and the key information is obtained, for example: "For this project, market research is needed first; people are then sent to the local area based on the research results, and whether to expand the market is decided according to the local business conditions".
In other embodiments, the preprocessing step may also be voice detection and recognition, motion or behavior detection and recognition, detection and recognition of information focused by multiple conference participants, presentation of a specific product, amplification of a sound of a specific speaker, partial amplification of information of a specific product, and the like.
After the corresponding key information is obtained, the matching is performed on the key information through the pre-stored terminal configuration file, and it can be understood that in the conference, the key information transmitted by the user through the terminal device needs to be corresponding, so that the problem that roles and the terminal devices cannot be distinguished in the conference record can be effectively avoided.
Specifically, referring to fig. 3, in some optional embodiments, the step S113 specifically includes the following steps S1131 to S1133:
s1131, performing voice decoding on each voice segment in sequence to obtain syllable data of each voice segment;
s1132, performing data correction on the syllable data of each voice fragment by using a preset wrong word library to generate correct syllable data of each voice fragment;
s1133, recombining the correct syllable data of each voice fragment into a correct voice fragment, and generating key information through the correct voice fragment.
Before this implementation, a plurality of voice segments are obtained according to the above steps. For example, the user's voice data contains the mis-recognized phrase "si chang diao yan" ("four-field investigation") where "shi chang diao yan" ("market research") was intended, and the conference system breaks the voice data into voice segments such as "first, a si chang diao yan is needed" through the various sentence-breaking vocabularies.
In a specific implementation, the voice segments are sequentially subjected to voice decoding to obtain the syllable data of each voice segment, for example:
"first, a si chang diao yan is needed" is decoded into the syllable data 'sh','ou'; 'x','ian'; 'x','u'; 'y','ao'; 'j','in'; 'x','ing'; 's','i'; 'c','hang'; 'd','iao'; 'y','an' (shou xian xu yao jin xing si chang diao yan).
Specifically, the conference system corrects the syllable data of each voice segment against a preset wrong-word library to obtain the correct syllable data, recombines the corrected syllable data with the remaining syllable data into a correct voice segment, and combines the correct segments into key information. For example, the wrong-word library identifies the syllables 's','i' ("si", four) in the data above and corrects them to 'sh','i' ("shi", market); the corrected syllables are recombined with the other syllable data into the voice segment "first, market research needs to be carried out", unnecessary words are removed, and the key information is obtained: "market research is needed".
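As a toy illustration of recombining corrected syllable data into a phrase (the single-entry lexicon is an assumption for the example):

```python
def recombine(syllables, lexicon):
    """Join (initial, final) syllable pairs into a pinyin phrase and look it up in a lexicon."""
    pinyin = " ".join(initial + final for initial, final in syllables)
    return lexicon.get(pinyin, pinyin)
```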
It can be understood that the wrong word bank can be formed by performing analog input on various wrong words and collecting the wrong words according to the analog input wrong words, and then storing the wrong words into the database.
Further, the conference is usually held in a form such as a video conference or a live conference, and in either form the information transmitted through the user terminal includes voice information, motion information, picture information, and the like.
Referring to fig. 4, in other embodiments, the step S101 further includes steps S121 to S124:
s121, acquiring action data input by a user terminal in the conference in real time at each time point on a time line of the conference;
s122, performing action decomposition on the action data through a preset action database to obtain a plurality of action frames of the action data;
s123, performing feature detection on each action frame to obtain feature point data of each action frame;
and S124, determining character information corresponding to the feature point data of each action frame according to the corresponding relation between the standard feature point data and the character information, and converting each character information into corresponding key information.
In a specific implementation, at a plurality of time points on a conference timeline, a conference system may obtain, in real time, motion data input by a user terminal in a conference, where in this embodiment, the motion data is a gesture motion, the conference timeline is associated with a conference time, the user terminal includes, but is not limited to, a microphone, and in other embodiments, the user terminal may also be a terminal device capable of transmitting audio and video signals, such as a mobile phone, a tablet computer, a notebook computer, and the like.
Specifically, the conference system decomposes the gesture actions through a preset action database to obtain a plurality of action frames of the action data, the decomposition action frames of various gesture actions are prestored in the action database, the gesture actions are compared with the various gesture actions in the action database to obtain the action frames of the gesture actions, and the action frames are correspondingly converted into a plurality of action frame images.
Further, each action frame image is input into a trained neural network model for feature detection to obtain the corresponding feature point data. An image encoder in the model performs edge extraction and feature extraction on the action frame in each image to obtain the image boundary of the action frame and the pixel point data of the image, and the image boundary and pixel point data are converted into the corresponding key information by combining the standard pixel point data and the correspondence between standard action frame images and character information.
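Standing in for the image encoder, a deliberately crude edge map over a tiny pixel grid illustrates the edge-extraction idea only (a real implementation would use the trained neural network model):

```python
def edge_points(frame):
    """Return pixel coordinates where the value differs from the right neighbour (a crude edge map)."""
    return [
        (r, c)
        for r, row in enumerate(frame)
        for c in range(len(row) - 1)
        if row[c] != row[c + 1]
    ]
```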
In other embodiments, in the step of performing edge extraction and feature extraction on the motion frame image, smoothing processing, transformation processing, enhancement processing, restoration processing, and filtering processing may be performed on the motion frame image.
S102, performing text conversion on each piece of key information and equipment information of corresponding terminal equipment to obtain corresponding text information, wherein the text information has a time line identifier of the conference;
In specific implementation, the obtained key information and the device information of its corresponding terminal device are converted into text. During conversion, each piece of key information is mapped onto a standard file to obtain its text information, and the timeline identifier of the conference is added to it. The timeline identifier ties each piece of text information to the timeline, so that subsequent conference records can be sorted by it, and text information from different terminal devices, as well as multiple entries from the same device, can be distinguished. For example, the format of the text information is as follows:
time (e.g.: 2022.4.1-9-30-AM) -name-device number;
text: first, a market research is required.
S103, performing semantic recognition on each text message in sequence, and generating topic information and a conference record source file of the conference according to a preset keyword extraction rule and a preset text arrangement rule;
in specific implementation, semantic recognition is carried out on each text message based on a preset topic model, and keywords in the text messages are extracted according to a keyword extraction rule so as to generate topic information of the conference; and meanwhile, sequencing each text message by using a text sequencing rule to generate a conference record source file.
It can be understood that the keyword extraction rule can input various keywords in the database, integrate the various keywords, and input the text information into the database for retrieval when extracting the keywords, so as to obtain the keywords corresponding to the text information; the text sorting rule may be identified according to the time line of the conference, or may be sorted according to the number of devices or names of users participating in the conference, and in other alternative embodiments, the text sorting rule may be set by the user.
For example:
2022.4.1-9:30: 15-AM-chenzhisomebody-S001;
firstly, market research is required;
extracting keywords in the text information by a keyword extraction rule to obtain 'market research', obtaining theme information corresponding to the keywords by combining a preset theme model, namely 'project discussion meeting', sequencing timeline identifications in the text information, and further obtaining a conference record source file of the whole conference.
S104, configuring the theme information and the time line identification of the conference into an identification title of the conference record source file, and storing the configured conference record source file into a conference record database.
In specific implementation, the obtained subject information and the obtained timeline identification are used as the identification title of the conference record source file, and the configured conference record source file is stored in a conference record database so as to facilitate subsequent search and user reference of the timeline and the subject.
In summary, in the automatic recording method for a conference in the above embodiment of the present invention, the terminal configuration file matches and corresponds the key information of each time point on the timeline of the conference with each terminal device in the conference, so that each key information in the conference is matched with the corresponding terminal device, and the phenomenon that the conference record is not properly matched with the personnel is avoided; meanwhile, the key information text is converted into text information, and a timeline identification is added into the text information, so that the situation that the conference records are staggered in time is avoided; text information is converted into corresponding subject information and a conference recording source file through semantic recognition, a keyword extraction rule and a text sorting rule, and therefore the problem caused by manual recording is solved; the conference record source file is stored in a conference record database, so that the conference staff can conveniently look up the conference record source file at a later stage.
Example two
Referring to fig. 5, an automatic conference recording method according to a second embodiment of the present invention is shown, where the method specifically includes steps S201 to S209:
s201, acquiring user information and control codes sent by a plurality of equipment terminals in a conference, wherein the user information at least comprises user identifications;
s202, numbering each terminal device according to each user identifier and each control code, and constructing a terminal configuration file according to the device terminal with the numbering completed;
in specific implementation, a conference system acquires user information and control codes sent by a plurality of terminal devices in a conference through a serial port protocol, and then binds the user with the terminal device sending the user information by using the user information and the control codes, and simultaneously numbers the bound terminal devices, and after the numbering is finished, a mapping file, namely a terminal configuration file, is constructed aiming at the terminal device and a user identifier.
It should be noted that the user information may be filled in by the user through the terminal device, and transmitted to the conference system by the terminal device, or the user information may be entered into each terminal device by a background person according to a participant, and when the user uses the terminal device, the user information in the terminal device is determined to be correct through recognition modes such as a human face or voice, so as to complete the binding between the terminal device and the user information.
For example: when a certain user inputs user information (name) by using a terminal device in a conference, the terminal device sends the user information and a control code of the terminal device to a conference system, and then the terminal device is bound with the certain user identity by the control code, and the terminal device is numbered as S001;
similarly, when user information (name) is input to another terminal device in a certain user conference, the terminal device is bound to the identity of the certain user conference, and the terminal device is numbered as S002.
It should be understood that when the user inputs the user information, the terminal device may confirm the authenticity of the user information according to the list of the participants, thereby preventing the user from missing the conference.
S203, acquiring key information of each time point on a time line of a conference, and performing terminal matching on each key information through a preset terminal configuration file so as to enable each key information to correspond to each terminal device in the conference;
in a specific implementation, at a plurality of time points on a conference timeline, a conference system may acquire, in real time, voice data input by a user terminal in a conference, where the conference timeline is associated with a conference time, the user terminal includes, but is not limited to, a microphone, and in other embodiments, the user terminal may also be a terminal device capable of transmitting audio and video signals, such as a mobile phone, a tablet computer, a notebook computer, and the like; the data input through the user terminal may also be motion data, picture data, or the like.
Specifically, the conference system decomposes the voice data through a preset voice database to obtain a plurality of voice segments of the voice data; various voice sentence-breaking vocabularies are prestored in the voice database, all speaking habits of people and normal semantic sentence-breaking vocabularies are recorded in the voice sentence-breaking vocabularies, and sentences are broken on the voice data through the various voice sentence-breaking vocabularies to obtain each voice segment of the voice data, for example: the voice data input by the user through the terminal device is 'for the project, the user considers that the project needs to be subjected to market research firstly, then sends people to a local area along with the result of the market research and determines whether to expand the market according to the business condition of the local area', the conference system punctuates the voice data into a plurality of voice segments such as 'for the project, the user considers that the project needs to be subjected to market research firstly', 'I considers that the market research needs to be subjected to firstly', 'for the local area along with the result of the market research', 'then sends people to the local area along with the result of the market research', 'sends people, and determines whether to expand the market according to the business condition of the local area'.
It should be noted that, in this embodiment, the step of decomposing the speech data includes, but is not limited to, performing sentence-breaking on the speech data, and in some optional embodiments, operations such as noise reduction and semantic recognition may also be performed on the speech data.
Furthermore, the plurality of voice segments are integrated to obtain corresponding key information. For example: and the plurality of voice fragments are: "regarding this item", "i think of" regarding this item "," need to do market research first "," i think of need to do market research first "," aim at local area with the result of market research "," dispatch of person with local area with the result of market research later "," dispatch of person ", and" decide whether to enlarge market according to the business situation of this local area "are integrated, unnecessary words are removed, and key information is obtained: "for this item, market research is first required, and as a result of the market research, a person is dispatched to a local area to determine whether or not to expand the market according to the business situation of the local area".
In other embodiments, the preprocessing step may also be voice detection and recognition, motion or behavior detection and recognition, detection and recognition of information focused by multiple conference participants, demonstration of a specific product, amplification of a sound of a specific speaker, and partial amplification representation of information of a specific product.
Before the implementation, a plurality of speech segments are obtained according to the above steps, for example: the voice data input by the user through the terminal equipment is that the user needs to do four-field investigation at first, and the conference system makes the voice data sentence break into voice segments such as that the user needs to do four-field investigation at first and that the user needs to do four-field investigation through various voice sentence breaking words.
In a specific implementation, the voice segments are sequentially subjected to voice decoding to obtain syllable data of each voice segment, for example:
firstly, four-field investigation is needed to be carried out, and the voice is decoded into sh and ou; 'x', 'ian'; 'x', 'u'; 'y', 'ao'; 'j', 'in'; 'x', 'ing'; ' s ', ' i ' c ', ' hand '; 'd', 'iao'; the 'y' and 'an' syllable data.
Specifically, the conference system performs data correction on the syllable data of each voice segment through a preset wrong word library to obtain correct syllable data of each syllable segment, recombines the corrected correct syllable data and other syllable data into a correct voice segment, and recombines key information through the correct voice segment. For example: mixing the above three parts of 'sh' and 'ou'; 'x', 'ian'; 'x', 'u'; 'y', 'ao'; 'j', 'in'; 'x', 'ing'; ' s ', ' i ' c ', ' hand '; 'd', 'iao'; the 'y' and 'an' syllable data are identified through a wrong word bank and corrected through data, and the's' and 'i' are corrected; correcting four syllable data of 'c', 'hand' into 'sh' and 'i'; 'c', 'hand', corrected 'sh', 'i'; the 'c' and 'hand' syllable data are recombined with other syllable data, and the market research is firstly needed to be carried out on the voice fragment, unnecessary words are removed, and further key information is obtained: "need market research".
It can be understood that the wrong word bank can be formed by performing analog input on various wrong words and collecting the wrong words according to the analog input wrong words, and then storing the wrong words into the database.
Further, the conference is usually in the form of a video conference, a live conference, and the like, and the information transmitted through the user terminal includes voice information, motion information, picture information, and the like regardless of the video conference and the live conference.
After the corresponding key information is obtained, the matching of the key information is performed through the pre-stored terminal configuration file, and it can be understood that the key information transmitted by the user through the terminal equipment needs to be corresponding in the conference, so that the problem that the roles and the terminal equipment cannot be distinguished in the conference record can be effectively avoided.
In some optional embodiments, in this step, motion data input by the user terminal may also be acquired.
In a specific implementation, at multiple time points on a conference timeline, a conference system may obtain, in real time, motion data input by a user terminal in a conference, where in this embodiment, the motion data is a gesture motion, the conference timeline is associated with a conference time, the user terminal includes, but is not limited to, a microphone, and in other embodiments, the user terminal may also be a terminal device capable of transmitting audio and video signals, such as a mobile phone, a tablet computer, a notebook computer, and the like.
Specifically, the conference system decomposes the gesture actions through a preset action database to obtain a plurality of action frames of the action data, the decomposition action frames of various gesture actions are prestored in the action database, the gesture actions are compared with various gesture actions in the action database to obtain action frames of the gesture actions, and each action frame is correspondingly converted into a plurality of action frame images.
Further, inputting each action frame image into the neural network model which is completed with learning to perform feature detection, obtaining corresponding feature point data, performing edge extraction and feature extraction on the action frame in each action frame image according to an image encoder in the neural network model to obtain an image boundary of the action frame in the action frame image and pixel point data of the action frame image, and converting the image boundary and the pixel point data of the action frame image into corresponding key information by combining standard pixel point data and the corresponding relation between the corresponding standard action frame image and character information.
In another embodiment, in the step of performing edge extraction and feature extraction on the motion frame image, smoothing processing, transformation processing, enhancement processing, restoration processing, and filtering processing may be performed on the motion frame image.
S204, performing text conversion on each piece of key information and the equipment information of the corresponding terminal equipment to obtain corresponding text information, wherein the text information has a time line identifier of the conference;
in specific implementation, the obtained key information and the device information of the terminal device corresponding to the key information are subjected to text conversion, when text conversion is performed, a corresponding relation is established between each key information and a standard file, so that text information corresponding to each key information is obtained, a timeline identification of the conference is added to the obtained text information, and by adding the timeline identification, the text information and the timeline can be corresponding, so that on one hand, subsequent conference records can be sorted according to the timeline identification, on the other hand, the text information of each terminal device can be distinguished, and the text information of the same terminal device can be distinguished, for example: the format of the text information is as follows:
time (e.g.: 2022.4.1-9-30-AM) -name-device number;
text: first, a market research is required.
S205, performing semantic recognition on each text message in sequence, and generating subject information and a conference record source file of the conference according to a preset keyword extraction rule and a preset text arrangement rule;
in specific implementation, semantic recognition is carried out on each text message based on a preset topic model, and keywords in the text messages are extracted according to a keyword extraction rule so as to generate topic information of the conference; and meanwhile, sequencing each text message by using a text sequencing rule to generate a conference record source file.
It can be understood that the keyword extraction rule can input various keywords in the database, integrate the various keywords, and input the text information into the database for retrieval when extracting the keywords, so as to obtain the keywords corresponding to the text information; the text ordering rule may be identified according to the timeline of the conference, or may be ordered according to the number of devices or names of users participating in the conference, and in other optional embodiments, the text ordering rule may be set by the user.
For example:
2022.4.1-9:30: 15-AM-chen somebody-S001;
firstly, market research is needed;
extracting keywords in the text information by a keyword extraction rule to obtain 'market research', obtaining theme information corresponding to the keywords by combining a preset theme model, namely 'project discussion meeting', sequencing timeline identifications in the text information, and further obtaining a conference record source file of the whole conference.
S206, configuring the theme information and the time line identifier of the conference into an identification title of the conference record source file, and storing the configured conference record source file into a conference record database;
in specific implementation, the obtained topic information and the timeline identification are used as identification titles of the conference record source file, and the configured conference record source file is stored in a conference record database, so that follow-up searching and users can look up the timeline and the topic conveniently.
S207, when a viewing request of a user for a conference record is acquired, a template selection signal of the user and an identification code of the conference record in the viewing request are analyzed;
s208, reading the identification code of the conference record, and judging whether the identification code of the conference record meets the preset requirement;
s209, if the identification code of the meeting record meets the preset requirement, selecting a corresponding template from a preset template library according to the template selection signal, and calling a meeting record source file corresponding to the meeting record identification code from the meeting record database to be led into the template so as to enable the user to check the meeting record source file.
In specific implementation, when a viewing request of a user for the meeting record is obtained, it means that the user needs to view the meeting record, in the viewing request, the user selects a format of the meeting record (for example, a format of PPT, word, EXCEL, or the like), and the conference system analyzes the template selection signal of the user in the viewing request and an identification code of the meeting record that the user wants to view.
It is understood that the identification code of the meeting record is stored in the meeting record database, and when the meeting record source file is stored in the meeting record database, a plurality of identification codes are formed for the identification header of each meeting record source file, and different identification codes correspond to different levels, for example: A. b and C identification codes, wherein the A identification code can be consulted by all personnel, the B identification code can be consulted by middle-layer personnel, and the C identification code can be consulted by high-layer personnel.
The method comprises the steps that an identification code of a conference record which a user wants to look up by the user is stored in a look-up request of the user, whether the user has a look-up authority is judged according to the identification code, if the user has the look-up authority, the conference record is led in through a template selected by the user, and then the corresponding conference record is led out.
In summary, compared with the first embodiment, the automatic conference recording method in the embodiments of the present invention binds users in a conference with terminal devices, so that subsequent conference recording information can be distinguished according to the terminal devices; by setting the identification codes, the consulting authority of different project meetings is ensured, and data leakage is avoided; by setting the template database, the conference records can be exported in a format selected by a user, and the overall practicability is further improved.
EXAMPLE III
In another aspect, referring to fig. 6, an automatic conference recording system according to a third embodiment of the present invention is further provided, where the system includes:
the matching module 11 is configured to acquire key information at each time point on a timeline of a conference, and perform terminal matching on each piece of key information through a preset terminal configuration file, so that each piece of key information corresponds to each piece of terminal equipment in the conference;
specifically, the matching module 11 includes:
the first acquisition unit is used for acquiring voice data input by a user terminal in the conference in real time at each time point on a time line of the conference;
the first decomposition unit is used for carrying out voice decomposition on the voice data through a preset voice database to obtain a plurality of voice fragments of the voice data;
the first processing unit is used for preprocessing each voice segment and constructing key information according to the preprocessed voice segments.
Further, the first processing unit is specifically configured to:
sequentially carrying out voice decoding on each voice segment to obtain syllable data of each voice segment;
carrying out data correction on the syllable data of each voice segment by using a preset wrong word library to generate correct syllable data of each voice segment;
and recombining the correct syllable data of each voice segment into a correct voice segment, and generating key information through the correct voice segment.
In some optional embodiments, the matching module 11 further comprises:
the second acquisition unit is used for acquiring action data input by the user terminal in the conference in real time at each time point on the time line of the conference;
the second decomposition unit is used for performing action decomposition on the action data through a preset action database to obtain a plurality of action frames of the action data;
the characteristic detection unit is used for carrying out characteristic detection on each action frame to obtain characteristic point data of each action frame;
and the second processing unit is used for determining the character information corresponding to the feature point data of each action frame according to the corresponding relation between the standard feature point data and the character information and converting each character information into corresponding key information.
A conversion module 12, configured to perform text conversion on each piece of key information and device information of a terminal device corresponding to the key information to obtain corresponding text information, where the text information includes a timeline identifier of the conference;
the recognition module 13 is configured to perform semantic recognition on each piece of text information in sequence, and generate topic information and a conference record source file of the conference according to a preset keyword extraction rule and a preset text arrangement rule;
and the configuration module 14 is configured to configure the subject information and the timeline identifier of the conference as a recognition header of the conference record source file, and store the configured conference record source file in a conference record database.
In some optional embodiments, the system further comprises:
the analysis module is used for analyzing a template selection signal of a user and an identification code of the conference record in a viewing request when the viewing request of the user for the conference record is obtained;
the judging module is used for reading the identification code of the conference record and judging whether the identification code of the conference record meets the preset requirement;
and the calling module is used for selecting a corresponding template from a preset template library according to the template selection signal if the identification code of the conference record meets a preset requirement, and calling a conference record source file corresponding to the conference record identification code from the conference record database to be led into the template so as to enable the user to check the conference record source file.
In some optional embodiments, the system further comprises:
the system comprises an acquisition module, a processing module and a control module, wherein the acquisition module is used for acquiring user information and control codes sent by a plurality of equipment terminals in a conference, and the user information at least comprises user identifications;
and the processing module is used for numbering each terminal device according to each user identifier and each control code and constructing a terminal configuration file according to the numbered device terminal.
The functions or operation steps of the modules and units when executed are substantially the same as those of the method embodiments, and are not described herein again.
The automatic conference recording system provided by the embodiment of the present invention has the same implementation principle and technical effect as the foregoing method embodiment, and for brief description, no mention is made in the system embodiment, and reference may be made to the corresponding contents in the foregoing method embodiment.
Example four
Referring to fig. 7, a computer device according to a fourth embodiment of the present invention is shown, and includes a memory 10, a processor 20, and a computer program 30 stored in the memory 10 and executable on the processor 20, where the processor 20 implements the automatic conference recording method when executing the computer program 30.
The memory 10 includes at least one type of readable storage medium including a flash memory, a hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a magnetic memory, a magnetic disk, an optical disk, and the like. The memory 10 may in some embodiments be an internal storage unit of the computer device, for example a hard disk of the computer device. The memory 10 may also be an external storage device in other embodiments, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like. Further, the memory 10 may also include both an internal storage unit and an external storage device of the computer apparatus. The memory 10 may be used not only to store application software installed in the computer device and various kinds of data, but also to temporarily store data that has been output or will be output.
In some embodiments, the processor 20 may be an Electronic Control Unit (ECU), a Central Processing Unit (CPU), a controller, a microcontroller, a microprocessor or other data Processing chips, and is configured to run program codes stored in the memory 10 or process data, for example, execute an access restriction program.
It should be noted that the configuration shown in fig. 7 does not constitute a limitation of the computer device, and in other embodiments the computer device may include fewer or more components than those shown, or some components may be combined, or a different arrangement of components.
An embodiment of the present invention further provides a readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the automatic conference recording method as described above.
Those of skill in the art will understand that the logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be viewed as implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is specific and detailed, but not to be understood as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, and these are all within the scope of protection of the present application. Therefore, the protection scope of the present patent application shall be subject to the appended claims.

Claims (8)

1. An automatic recording method for a conference, the method comprising:
acquiring key information at each time point on a timeline of the conference, and performing terminal matching on each piece of key information through a preset terminal configuration file, so that each piece of key information corresponds to a terminal device in the conference;
wherein the step of acquiring key information at each time point on a timeline of the conference further comprises:
acquiring action data input by a user terminal in the conference in real time at each time point on a time line of the conference;
comparing the action data against a preset action database to obtain a plurality of action frames of the action data, and converting the action frames into corresponding action frame images;
performing edge extraction and feature extraction on the action frame in each action frame image with an image encoder in a trained neural network model, so as to obtain the image boundary of the action frame in each action frame image and the pixel point data of each action frame image;
converting the image boundary of the action frame in each action frame image and the pixel point data of each action frame image into corresponding key information, using the standard pixel point data and the correspondence between standard action frame images and text information;
performing text conversion on each piece of key information together with the device information of the corresponding terminal device to obtain corresponding text information, wherein the text information carries a timeline identifier of the conference;
performing semantic recognition on each piece of text information in sequence, and generating topic information and a conference record source file of the conference according to a preset keyword extraction rule and a text arrangement rule;
configuring the topic information and the timeline identifier of the conference as the identification title of the conference record source file, and storing the configured conference record source file in a conference record database;
when the conference record source file is stored in the conference record database, forming a plurality of identification codes for the identification title of each conference record source file, wherein different identification codes correspond to different levels and different permissions;
when a viewing request of a user for a conference record is acquired, parsing the template selection signal of the user and the identification code of the conference record in the viewing request;
reading the identification code of the conference record, and determining whether the identification code of the conference record carries viewing permission; and
if the identification code of the conference record carries viewing permission, selecting a corresponding template from a preset template library according to the template selection signal, retrieving the conference record source file corresponding to the identification code of the conference record from the conference record database, and importing it into the template so that the user can view the conference record.
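The viewing-permission steps at the end of claim 1 can be sketched roughly as follows. This is an illustrative outline only, not the claimed implementation: the database layout, the identification-code fields, and the template strings are all hypothetical names introduced for the example.

```python
# Illustrative sketch of claim 1's viewing flow: each conference record
# source file is stored under its identification title with several
# identification codes, where each code carries a level and a permission.
# All names and values here are hypothetical.
MEETING_RECORD_DB = {
    "2022-07-13-project-review": {
        "source_file": "full conference record text",
        "codes": {
            "A1": {"level": 1, "can_view": True},
            "B2": {"level": 2, "can_view": False},
        },
    },
}

# A toy "preset template library"; real templates would be documents.
TEMPLATE_LIBRARY = {
    "plain": "=== {title} ===\n{body}",
    "detailed": "Title: {title}\n----\n{body}\n----",
}

def handle_view_request(title, code, template_signal):
    """Parse a viewing request: check the identification code's viewing
    permission, then import the matching source file into the template
    chosen by the user's template selection signal."""
    record = MEETING_RECORD_DB[title]
    grant = record["codes"].get(code)
    if grant is None or not grant["can_view"]:
        return None  # identification code lacks viewing permission
    template = TEMPLATE_LIBRARY[template_signal]
    return template.format(title=title, body=record["source_file"])

rendered = handle_view_request("2022-07-13-project-review", "A1", "plain")
denied = handle_view_request("2022-07-13-project-review", "B2", "plain")
```

Under these assumptions, code `A1` yields the record rendered into the selected template, while code `B2`, whose permission entry denies viewing, yields nothing.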
2. The automatic recording method for a conference as claimed in claim 1, wherein said step of acquiring key information at each time point on a timeline of a conference is preceded by the steps of:
acquiring user information and control codes sent by a plurality of equipment terminals in a conference, wherein the user information at least comprises user identifications;
numbering each device terminal according to its user identifier and control code, and constructing the terminal configuration file from the numbered device terminals.
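Claim 2's construction of the terminal configuration file can be sketched as below; a minimal sketch under the assumption that each device terminal reports a (user identifier, control code) pair, with all field names invented for illustration.

```python
# Illustrative sketch of claim 2: number each device terminal from its
# reported user identifier and control code, and build a terminal
# configuration file from the numbered terminals. Field names are
# hypothetical.
def build_terminal_profile(terminals):
    """terminals: list of (user_id, control_code) pairs reported by the
    device terminals in the conference. Returns a configuration mapping
    a sequential device number to each terminal's identifying data."""
    profile = {}
    for number, (user_id, control_code) in enumerate(terminals, start=1):
        profile[number] = {"user_id": user_id, "control_code": control_code}
    return profile

profile = build_terminal_profile([("user-001", "0x1A"), ("user-002", "0x2B")])
```

The sequential number then serves as the key the matching step of claim 1 could use to associate each piece of key information with a terminal device.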
3. The automatic recording method for a conference according to claim 1, wherein said step of acquiring key information at each time point on a timeline of a conference comprises:
acquiring voice data input by a user terminal in the conference in real time at each time point on a time line of the conference;
performing voice decomposition on the voice data through a preset voice database to obtain a plurality of voice segments of the voice data; and
preprocessing each voice segment, and constructing key information from the preprocessed voice segments.
4. The automatic conference recording method according to claim 3, wherein the step of preprocessing each voice segment and constructing key information from the preprocessed voice segments comprises:
performing voice decoding on each voice segment in sequence to obtain syllable data of each voice segment;
correcting the syllable data of each voice segment against a preset wrong-word library to generate correct syllable data for each voice segment; and
recombining the correct syllable data of each voice segment into a correct voice segment, and generating key information from the correct voice segments.
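The correction-and-recombination chain of claims 3 and 4 can be illustrated with a toy example. The "wrong-word library" below is a simple lookup table invented for the sketch; an actual system would presumably draw on a speech-recognition lattice or language model.

```python
# Hedged sketch of claim 4: decode voice segments into syllable data,
# correct the syllables against a wrong-word library, and recombine the
# corrected segments into key information. The library contents are toy
# examples, not part of the claimed system.
WRONG_WORD_LIBRARY = {"meating": "meeting", "recored": "record"}

def correct_segment(syllables):
    """Replace any syllable found in the wrong-word library with its
    correct form, then recombine into a corrected voice segment."""
    corrected = [WRONG_WORD_LIBRARY.get(s, s) for s in syllables]
    return " ".join(corrected)

def build_key_information(segments):
    """segments: list of syllable lists, one per decoded voice segment.
    Returns one piece of key information per corrected segment."""
    return [correct_segment(seg) for seg in segments]

key_info = build_key_information([["start", "the", "meating"],
                                  ["recored", "minutes"]])
# key_info == ["start the meeting", "record minutes"]
```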
5. An automatic conference recording system, the system comprising:
the matching module is used for acquiring key information at each time point on a timeline of a conference and performing terminal matching on each piece of key information through a preset terminal configuration file, so that each piece of key information corresponds to a terminal device in the conference;
the conversion module is used for performing text conversion on each piece of key information and the equipment information of the corresponding terminal equipment to obtain corresponding text information, and the text information has a time line identifier of the conference;
the recognition module is used for carrying out semantic recognition on each text message in sequence and generating topic information and a conference record source file of the conference according to a preset keyword extraction rule and a text arrangement rule;
the configuration module is used for configuring the topic information and the timeline identifier of the conference as the identification title of the conference record source file and storing the configured conference record source file in a conference record database;
wherein the matching module further comprises:
the second acquisition unit is used for acquiring action data input by the user terminal in the conference in real time at each time point on the time line of the conference;
the second decomposition unit is used for comparing the action data against a preset action database to obtain a plurality of action frames of the action data and converting the action frames into corresponding action frame images;
the feature detection unit is used for performing edge extraction and feature extraction on the action frames in the action frame images with an image encoder in a trained neural network model, so as to obtain the image boundaries of the action frames in the action frame images and the pixel point data of the action frame images;
the second processing unit is used for converting the image boundary of the action frame in each action frame image and the pixel point data of each action frame image into corresponding key information, using the standard pixel point data and the correspondence between standard action frame images and text information;
an analysis module, configured to form a plurality of identification codes for the identification title of each conference record source file when the conference record source file is stored in the conference record database, wherein different identification codes correspond to different levels and different permissions, and to parse, when a viewing request of a user for a conference record is acquired, the template selection signal of the user and the identification code of the conference record in the viewing request;
the judging module is used for reading the identification code of the conference record and judging whether the identification code of the conference record has the viewing permission;
and the calling module is used for selecting a corresponding template from a preset template library according to the template selection signal if the identification code of the conference record carries viewing permission, retrieving the conference record source file corresponding to the identification code of the conference record from the conference record database, and importing it into the template so that the user can view the conference record source file.
6. The automatic recording system for conferences of claim 5, wherein the system further comprises:
the analysis module is used for analyzing the template selection signal of the user and the identification code of the conference record in the viewing request when the viewing request of the user for the conference record is obtained;
the judging module is used for reading the identification code of the conference record and determining whether the identification code of the conference record meets a preset requirement; and
the calling module is used for selecting a corresponding template from a preset template library according to the template selection signal if the identification code of the conference record meets the preset requirement, retrieving the conference record source file corresponding to the identification code of the conference record from the conference record database, and importing it into the template so that the user can view the conference record source file.
7. A readable storage medium on which a computer program is stored, the program, when executed by a processor, implementing the automatic recording method for a conference according to any one of claims 1 to 4.
8. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the automatic conference recording method according to any one of claims 1 to 4 when executing the computer program.
CN202210818164.XA 2022-07-13 2022-07-13 Conference automatic recording method, system, readable storage medium and computer equipment Active CN114912463B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210818164.XA CN114912463B (en) 2022-07-13 2022-07-13 Conference automatic recording method, system, readable storage medium and computer equipment


Publications (2)

Publication Number Publication Date
CN114912463A CN114912463A (en) 2022-08-16
CN114912463B true CN114912463B (en) 2022-10-25

Family

ID=82772016

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210818164.XA Active CN114912463B (en) 2022-07-13 2022-07-13 Conference automatic recording method, system, readable storage medium and computer equipment

Country Status (1)

Country Link
CN (1) CN114912463B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115828907B (en) * 2023-02-16 2023-04-25 南昌航天广信科技有限责任公司 Intelligent conference management method, system, readable storage medium and computer device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107924392A (en) * 2015-08-26 2018-04-17 微软技术许可有限责任公司 Annotation based on posture
CN114168710A (en) * 2021-12-08 2022-03-11 北京百度网讯科技有限公司 Method, device, system, equipment and storage medium for generating conference record

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112399133B (en) * 2016-09-30 2023-04-18 阿里巴巴集团控股有限公司 Conference sharing method and device
CN109243484A (en) * 2018-10-16 2019-01-18 上海庆科信息技术有限公司 A kind of generation method and relevant apparatus of conference speech record
CN109473103A (en) * 2018-11-16 2019-03-15 上海玖悦数码科技有限公司 A kind of meeting summary generation method
CN111432157B (en) * 2020-02-18 2023-04-07 视联动力信息技术股份有限公司 Conference processing method, device, equipment and storage medium based on video networking
CN112786045B (en) * 2021-01-04 2024-03-12 上海明略人工智能(集团)有限公司 Device, server, method and system for conference recording
CN113689855A (en) * 2021-08-18 2021-11-23 北京铁道工程机电技术研究所股份有限公司 Conference record generation system, method, device and storage medium


Also Published As

Publication number Publication date
CN114912463A (en) 2022-08-16

Similar Documents

Publication Publication Date Title
US11055342B2 (en) System and method for rich media annotation
US6687671B2 (en) Method and apparatus for automatic collection and summarization of meeting information
US20210110832A1 (en) Method and device for user registration, and electronic device
CN102985965B (en) Voice print identification
CN109388701A (en) Minutes generation method, device, equipment and computer storage medium
WO2021175019A1 (en) Guide method for audio and video recording, apparatus, computer device, and storage medium
CN102207844A (en) Information processing device, information processing method and program
CN110211590B (en) Conference hotspot processing method and device, terminal equipment and storage medium
US10567590B2 (en) Apparatus, system, and method of conference assistance
CN110347866B (en) Information processing method, information processing device, storage medium and electronic equipment
US10089898B2 (en) Information processing device, control method therefor, and computer program
CN114912463B (en) Conference automatic recording method, system, readable storage medium and computer equipment
CN109087647B (en) Voiceprint recognition processing method and device, electronic equipment and storage medium
JPWO2011018867A1 (en) Information classification apparatus, information classification method, and program
KR101618084B1 (en) Method and apparatus for managing minutes
KR20190065194A (en) METHOD AND APPARATUS FOR GENERATING READING DOCUMENT Of MINUTES
US20230326369A1 (en) Method and apparatus for generating sign language video, computer device, and storage medium
Bailer et al. Challenges for Automatic Detection of Fake News Related to Migration
CN114554131A (en) High-security smart screen conference content sharing method, system and medium
CN104199908B (en) Generated by search engine and customize the method for content, system and search engine
CN111785280A (en) Identity authentication method and device, storage medium and electronic equipment
KR20210054157A (en) Apparatus and method for producing conference record
CN115828907B (en) Intelligent conference management method, system, readable storage medium and computer device
US11947872B1 (en) Natural language processing platform for automated event analysis, translation, and transcription verification
CN116614669A (en) Audio and video data processing method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Automatic meeting recording method, system, readable storage medium, and computer equipment

Effective date of registration: 20230321

Granted publication date: 20221025

Pledgee: Bank of China Limited Nanchang Donghu sub branch

Pledgor: NANCHANG HANG TIAN GUANG XIN TECHNOLOGY Co.,Ltd.

Registration number: Y2023980035663