CN109243484A

CN109243484A - Method for generating a conference speech record and related apparatus

Info

Publication number: CN109243484A
Application number: CN201811202730.4A
Authority: CN
Inventors: 徐炜; 刘丽
Original assignee: Shanghai Mxchip Information Technology Co Ltd
Current assignee: Shanghai Mxchip Information Technology Co Ltd
Priority date: 2018-10-16
Filing date: 2018-10-16
Publication date: 2019-01-18

Abstract

This application provides a method for generating a conference speech record, comprising: receiving a voice data packet of a conference speech and device identity data; performing speech-to-text conversion on the voice data packet to obtain text data; and generating a meeting-record acquisition code from the device identity data, so that the text data can be obtained by means of the meeting-record acquisition code. Participants do not need to take minutes manually: the meeting-record acquisition code alone is enough to obtain the text data of the meeting record, which greatly reduces the note-taking burden on participants, makes the meeting more effective, and gives participants a better meeting experience. This application also provides a system for generating a conference speech record and a computer-readable storage medium, which have the same beneficial effects.

Description

Method for generating a conference speech record and related apparatus

Technical field

This application relates to the field of the Internet of Things, and in particular to a method for generating a conference speech record and a related apparatus.

Background art

For large conferences, the organizer or conference company often needs to hold the meeting in a hotel. In this setting there are many attendees, such as industry professionals and news reporters. Attendees usually have to record the conference content by hand so that it can be edited and used afterwards.

Purely manual note-taking is cumbersome and very inefficient. On the one hand, the venue of a large conference is big and the audience is large, so differences in seat position and angle can seriously affect how well attendees can take notes. On the other hand, the agenda is tight, so content is easily missed or misheard. This degrades the attendees' experience, reduces the practical value of attending, and prevents the organizer from achieving the intended purpose of the meeting.

Summary of the invention

The purpose of this application is to provide a method for generating a conference speech record and a related apparatus, solving the problem that, in existing conference settings, conference content can only be recorded by each participant individually, which results in a poor attendance experience and low efficiency.

To solve the above technical problem, this application provides a method for generating a conference speech record. The specific technical solution is as follows:

Receive a voice data packet of a conference speech and device identity data;

Perform speech-to-text conversion on the voice data packet to obtain text data;

Generate a meeting-record acquisition code from the device identity data, so that the text data can be obtained by means of the meeting-record acquisition code.

Before receiving the voice data packet of the conference speech and the device identity data, the method further includes:

Collect a recording of the conference speech, and generate the voice data packet from the recording.

Collecting the recording of the conference speech and generating the voice data packet from the recording includes:

Collect the recording of the conference speech and generate the corresponding voice data from the recording;

Determine whether the voice data contains a blank longer than a preset duration;

If so, divide the voice data into voice data segments using the blank as a cut point; the voice data packet includes all of the voice data segments.

Performing speech-to-text conversion on the voice data packet to obtain text data specifically includes:

Perform speech-to-text conversion on each voice data segment to obtain the corresponding segment text data;

Splice the segment text data into the text data.

Splicing the segment text data into the text data includes:

Splice the segment text data in chronological order according to the timestamp of each voice data segment to generate the text data.

When an invalid voice data segment or a damaged voice data segment exists, the method further includes:

Determine the timestamp corresponding to the invalid voice data segment or the damaged voice data segment;

Determine the auxiliary voice data segment that covers the same time as the timestamp;

Replace the invalid voice data segment or the damaged voice data segment with the auxiliary voice data segment, where the auxiliary voice data segment and the invalid or damaged voice data segment were collected by different recording devices.

The meeting-record acquisition code is a QR code.

This application also provides a system for generating a conference speech record, comprising:

A data receiving device, for receiving a voice data packet of a conference speech and device identity data;

A recording conversion device, for performing speech-to-text conversion on the voice data packet to obtain text data;

A record distribution device, for generating a meeting-record acquisition code from the device identity data, so that the text data can be obtained by means of the meeting-record acquisition code.

The data receiving device and the recording conversion device are devices in a cloud server.

The system further includes:

A recording device, connected to the data receiving device, for collecting a recording of the conference speech and generating the voice data packet from the recording.

The recording device is configured to connect to a network after receiving a network connection instruction from a mobile terminal, and to send the voice data packet to the data receiving device over the network.

When the connection between the recording device and the network is interrupted, the network connection unit in the recording device is used to reconnect automatically or to switch over automatically.

This application also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the generation method described above.

The method for generating a conference speech record provided by this application comprises: receiving a voice data packet of a conference speech and device identity data; performing speech-to-text conversion on the voice data packet to obtain text data; and generating a meeting-record acquisition code from the device identity data, so that the text data can be obtained by means of the meeting-record acquisition code.

This application first obtains the voice data packet of the conference speech recording and the corresponding device identity data. After speech-to-text conversion of the voice data packet, a mapping between the text data and the device identity data is established. A meeting-record acquisition code is then generated from the device identity data, and a user can use that acquisition code to obtain the text data corresponding to the voice data packet. With this method, participants do not need to take minutes manually: the meeting-record acquisition code alone is enough to obtain the text data of the meeting record, which greatly reduces the note-taking burden on participants, makes the meeting more effective, and gives participants a better meeting experience. This application also provides a system for generating a conference speech record and a computer-readable storage medium, which have the same beneficial effects and are not described again here.

Brief description of the drawings

To describe the technical solutions in the embodiments of this application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are merely embodiments of this application, and a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a flowchart of a method for generating a conference speech record provided by an embodiment of this application;

Fig. 2 is a flowchart of a method for generating a voice data packet provided by an embodiment of this application;

Fig. 3 is a flowchart of another method for generating a voice data packet provided by an embodiment of this application;

Fig. 4 is a schematic structural diagram of a system for generating a conference speech record provided by an embodiment of this application;

Fig. 5 is a schematic structural diagram of another system for generating a conference speech record provided by an embodiment of this application;

Fig. 6 is a first schematic structural diagram of another system for generating a conference speech record provided by an embodiment of this application;

Fig. 7 is a second schematic structural diagram of another system for generating a conference speech record provided by an embodiment of this application;

Fig. 8 is a third schematic structural diagram of another system for generating a conference speech record provided by an embodiment of this application;

Fig. 9 is a first schematic structural diagram of yet another system for generating a conference speech record provided by an embodiment of this application;

Fig. 10 is a second schematic structural diagram of yet another system for generating a conference speech record provided by an embodiment of this application.

Detailed description of the embodiments

To make the purposes, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application without creative effort shall fall within the protection scope of this application.

Referring to Fig. 1, Fig. 1 is a flowchart of a method for generating a conference speech record provided by an embodiment of this application. The generation method includes:

S101: receive a voice data packet of a conference speech and device identity data;

This step obtains the voice data packet of the conference speech and the device identity data of that voice data packet. The device identity data and the voice data packet correspond one to one.

It can be understood that, by default, a process of collecting a recording of the conference speech and generating the voice data packet from the recording takes place before this step. At a large conference there are usually several recording devices, so that the venue is recorded from all directions with no blind spots, in particular around the speaker and around the other attendees, because exchanges such as questions may take place between the speaker and the audience. The received voice data packet should therefore carry the corresponding device identity data. In other words, every recording device at the venue should have unique device identity data. The specific content of the device identity data is not limited here, but it should be data capable of distinguishing different recording devices; for example, it may be a device identification code. In particular, if the recording device has a networking function and is connected to a network (including a local area network), the device identity data may be an IP address or other data with a unique identification function. The device identity data may also include several kinds of uniquely identifying data at once, for example both a device identification code and an IP address.
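A minimal sketch of how received voice data packets might carry device identity data is shown below. The class and field names (`DeviceIdentity`, `device_code`, `ip_address`, `VoicePacket`, `group_by_device`) are illustrative assumptions and do not come from this application; the only property relied on is that the identity distinguishes one recording device from another.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class DeviceIdentity:
    """Hypothetical device identity: an identification code plus an optional IP address."""
    device_code: str
    ip_address: Optional[str] = None


@dataclass
class VoicePacket:
    """Hypothetical voice data packet tagged with the identity of its recording device."""
    identity: DeviceIdentity
    audio: bytes


def group_by_device(packets: List[VoicePacket]) -> Dict[DeviceIdentity, List[VoicePacket]]:
    """Group received packets by the recording device that produced them."""
    grouped: Dict[DeviceIdentity, List[VoicePacket]] = {}
    for packet in packets:
        grouped.setdefault(packet.identity, []).append(packet)
    return grouped
```

Making the identity an immutable value lets it serve directly as a lookup key, so packets from different recording devices cannot be mixed up downstream.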

It should be noted that this step is usually performed by a device, terminal, or processing apparatus with a speech-to-text conversion function, in particular a cloud server.

The form of the voice data in the voice data packet is not limited here. It can be understood that, although the conference speech is a complete whole, it does not necessarily correspond to a single voice data packet. To make the subsequent speech-to-text conversion more efficient, and to make transmission of the voice data packet more secure, the voice data packet may include multiple voice data segments. In other words, voice data segments can be generated during recording, and splicing all the voice data segments in chronological order yields the recording file of the entire conference speech.

How the voice data segments are divided is not limited here. For example, a division duration may be set, so that one voice data segment is generated every 10 s, 20 s, or some other interval; the division may also follow the pauses of the speaker.
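The fixed-duration option can be sketched as follows. This is only an illustration under assumptions: the audio is taken to be a plain list of samples, and the function name `split_fixed` and its parameters are hypothetical.

```python
from typing import List, Sequence, Tuple


def split_fixed(samples: Sequence[float], sample_rate: int,
                segment_seconds: float = 10.0) -> List[Tuple[float, Sequence[float]]]:
    """Cut the samples into consecutive segments of at most `segment_seconds`.

    Each result is (start time in seconds, samples of the segment), so the
    start time can later serve as the timestamp of the segment.
    """
    step = int(sample_rate * segment_seconds)
    if step <= 0:
        raise ValueError("segment length must cover at least one sample")
    return [(start / sample_rate, samples[start:start + step])
            for start in range(0, len(samples), step)]
```

The last segment is simply shorter when the recording length is not a multiple of the division duration.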

How the voice data packet is received, that is, how the connection to the recording device is made, is not limited here. The connection may be wired or wireless; wireless connections include Wi-Fi, GPRS, and other communication modes that support audio signal transmission.

It should be added that the conference speech record in this step is not limited to meeting rooms; it applies to any occasion and place where speech content can be recorded. In other words, a conference speech can be regarded as any speech content (for example, a speech given by telephone or by video), and the place may be a meeting room, a classroom, an office, and so on.

S102: perform speech-to-text conversion on the voice data packet to obtain text data;

This step performs speech-to-text conversion on the voice data packet obtained in S101 and obtains the corresponding text data.

The specific speech conversion process is not limited here. It can be understood that interruptions, noise, and other special circumstances inevitably occur during a meeting. The conversion therefore needs to include processing such as noise reduction, to reduce their effect on the recognition of speech as text. If a stretch of content in some voice data packet really cannot be transcribed, or is transcribed as garbled text, the voice data packet from another recording device at the same venue can be used to transcribe that stretch. Using multiple recording devices thus improves the fault tolerance of generating the conference speech record. During conversion, the device identity data can be used to identify the source device of each part of the text data. The device identity data can also be used to detect whether the recording device corresponding to a voice data packet has failed.

It should be noted that the resulting text data needs to be associated with the device identity data from step S101.

If the voice data packet includes several voice data segments, as described in S101, then the conversion yields segment text data for each voice data segment, and the segment text data must finally be spliced to generate the complete text data. Naturally, the splicing follows chronological order according to the timestamp of each voice data segment.
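The splicing step amounts to sorting the segment texts by timestamp and joining them. A minimal sketch follows, assuming each converted segment is available as a `(timestamp, text)` pair; the function name `splice_text` and the space separator are illustrative assumptions.

```python
from typing import Iterable, Tuple


def splice_text(segments: Iterable[Tuple[float, str]], separator: str = " ") -> str:
    """Join segment texts in chronological order of their timestamps."""
    ordered = sorted(segments, key=lambda item: item[0])
    return separator.join(text for _, text in ordered)
```

Sorting by timestamp means the conversion results may arrive in any order, for example when segments are converted in parallel.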

S103: generate a meeting-record acquisition code from the device identity data, so that the text data can be obtained by means of the meeting-record acquisition code.

It should be noted that the content of the meeting-record acquisition code must include at least the device identity data. One important function of the device identity data is to distinguish meeting places. In venues with several meeting rooms, such as hotels, the meeting room must be specified when the meeting record is obtained after the meeting, so the recording devices of each meeting room should be coded as one series.

For example, suppose there are three meeting rooms A, B, and C, each with three recording devices: A1, A2, A3, B1, B2, B3, C1, C2, C3. Someone who wants the conference content of meeting room A only needs to obtain the record from any one or more of A1, A2, and A3. If part of the text data corresponding to recording device A1 is obviously wrong, the text data converted from recording device A2 or A3 can be obtained instead.

Apart from the device identity data, the form of the meeting-record acquisition code and its other content are not limited here.

In form, the meeting-record acquisition code can be any carrier that provides additional information, including but not limited to a QR code or a website link. In content, taking a QR code as an example, scanning it may lead to a download link for the text data file, a WeChat official account, a payment link (the text data is obtained after payment), an app download link, or another web link. Taking an app as an example, a login account can be set up and provided to participants in advance by the meeting organizer, so that only users who hold an account can obtain the text data of the meeting record. A forwarding function can also be provided in the app, allowing users to forward the text data to a mailbox and so on.
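As one possible illustration of an acquisition code in link form, the sketch below embeds the meeting room and its recording devices in a URL query string. The URL `https://example.com/records`, the parameter names `room` and `devices`, and both function names are hypothetical; this application only requires that the code contain the device identity data. Rendering the link as a QR code would need a separate QR library and is not shown.

```python
from typing import List
from urllib.parse import parse_qs, urlencode, urlsplit


def build_acquisition_link(base_url: str, room: str, device_ids: List[str]) -> str:
    """Build a link whose query string carries the device identity data."""
    query = urlencode({"room": room, "devices": ",".join(device_ids)})
    return f"{base_url}?{query}"


def devices_from_link(link: str) -> List[str]:
    """Recover the device identifiers from a link built by build_acquisition_link."""
    params = parse_qs(urlsplit(link).query)
    return params["devices"][0].split(",")
```

A link for meeting room A could be built with `build_acquisition_link("https://example.com/records", "A", ["A1", "A2", "A3"])`, and the server resolving it could call `devices_from_link` to decide which text data to return.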

The whole process from S101 to S103 involves the following associations: device identity data with voice data, device identity data with text data, and meeting-record acquisition code with text data. The chain runs from the initial association between device identity data and voice data to the final association between the meeting-record acquisition code and the text data. This ensures that voice data from different recording devices is not mixed up while the meeting record is generated, and that participants can ultimately use the meeting-record acquisition code to obtain the text data of the meeting room they need, or even of a particular recording device in that meeting room.

It should be noted that there is no fixed execution order between S103 and S101 or S102. More precisely, the meeting-record acquisition code need not be generated only after the voice data packet has been received; it can be generated directly from the device identity data, and the meeting organizer can distribute it to participants before the meeting begins.

Obtaining the text data by means of the meeting-record acquisition code, however, is possible only after the voice data packet has been received and the text data has been generated. Preferably, when the voice data packet consists of several voice data segments, participants can use the meeting-record acquisition code to obtain the conference speech content in real time; that is, each time a speech-to-text conversion completes, the resulting text data can be displayed by means of the meeting-record acquisition code.
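The ordering described above can be sketched with a small in-memory store: the acquisition code depends only on the device identity, so it can be issued before any audio arrives, while fetching returns whatever text has been converted so far. The class `RecordStore`, its methods, and the `record:` prefix are illustrative assumptions, not part of this application.

```python
from typing import Dict, List


class RecordStore:
    """Hypothetical store linking acquisition codes to converted text."""

    PREFIX = "record:"

    def __init__(self) -> None:
        self._texts: Dict[str, List[str]] = {}

    def acquisition_code(self, device_id: str) -> str:
        """Derive the code from the device identity alone (no audio needed yet)."""
        return f"{self.PREFIX}{device_id}"

    def append_text(self, device_id: str, segment_text: str) -> None:
        """Store the text of one newly converted voice data segment."""
        self._texts.setdefault(device_id, []).append(segment_text)

    def fetch(self, code: str) -> str:
        """Return all text converted so far for the device behind the code."""
        if not code.startswith(self.PREFIX):
            raise ValueError("not a meeting-record acquisition code")
        device_id = code[len(self.PREFIX):]
        return " ".join(self._texts.get(device_id, []))
```

Because `fetch` reads the current state on every call, a participant polling with the same code sees the record grow as each segment is converted.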

This embodiment first obtains the voice data packet of the conference speech recording and the corresponding device identity data. After speech-to-text conversion of the voice data packet, a mapping between the text data and the device identity data is established. A meeting-record acquisition code is then generated from the device identity data, and a user can use that acquisition code to obtain the text data corresponding to the voice data packet. Participants do not need to take minutes manually: the meeting-record acquisition code alone is enough to obtain the text data of the meeting record, which greatly reduces the note-taking burden on participants, makes the meeting more effective, and gives participants a better meeting experience.

Based on the above embodiment, as a preferred embodiment, this embodiment provides, for S101 only, a preferred method of collecting the recording of the conference speech and generating the voice data packet. Referring to Fig. 2, Fig. 2 is a flowchart of a method for generating a voice data packet provided by an embodiment of this application. The specific scheme is as follows:

S201: collect the recording of the conference speech and generate the corresponding voice data from the recording;

S202: determine whether the voice data contains a blank longer than a preset duration; if so, go to S203;

S203: divide the voice data into voice data segments using the blank as a cut point; the voice data packet includes all of the voice data segments.

Note in particular that this embodiment involves three concepts: voice data, voice data packet, and voice data segment. Voice data refers to the recording file obtained by recording the conference speech, while the voice data packet and the voice data segment are forms in which the voice data may exist. Unless otherwise specified below, voice data and voice data packet refer to the same content.

In this embodiment, while the voice data is collected, it is determined whether the voice data contains a blank of a certain duration; a blank here means a pause in the speaker's speech. The preset duration is not limited here; for example, it may be 0.5 s, and a person skilled in the art can set it as appropriate. A common way to determine whether the speaker has paused is to check the sound level in decibels, since the level during a pause is always lower than the level during speech. Other methods can of course also be used to detect a blank; they are not enumerated one by one here.

If a blank longer than the preset duration exists, the voice data can be divided into voice data segments using the blank as a cut point. This ensures that a complete sentence is not split across two voice data segments, which would affect the subsequent conversion of speech into text data.

If no blank longer than the preset duration exists, the speech is continuing, so there is no need to divide the voice data, and the current voice data segment continues to be generated.
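Steps S201 to S203 can be sketched as follows, using the decibel comparison mentioned above. The sketch assumes the recording has already been reduced to one loudness value per fixed-length frame; the threshold of -40 dB, the choice to cut in the middle of the blank, and the function names are illustrative assumptions rather than details given in this application.

```python
from typing import List, Sequence, Tuple


def find_blanks(levels_db: Sequence[float], silence_db: float) -> List[Tuple[int, int]]:
    """Return (start, end) frame ranges whose level stays below `silence_db`."""
    blanks, start = [], None
    for index, level in enumerate(levels_db):
        if level < silence_db:
            if start is None:
                start = index          # a blank begins here
        elif start is not None:
            blanks.append((start, index))
            start = None               # speech resumed, blank closed
    if start is not None:
        blanks.append((start, len(levels_db)))
    return blanks


def split_on_blanks(levels_db: Sequence[float], frame_seconds: float,
                    silence_db: float = -40.0,
                    min_blank_seconds: float = 0.5) -> List[Tuple[int, int]]:
    """Split into frame ranges at every blank longer than `min_blank_seconds`."""
    cuts = [0]
    for start, end in find_blanks(levels_db, silence_db):
        if (end - start) * frame_seconds > min_blank_seconds:
            cuts.append((start + end) // 2)   # assumed: cut in the middle of the blank
    cuts.append(len(levels_db))
    return [(a, b) for a, b in zip(cuts, cuts[1:]) if b > a]
```

Shorter blanks are ignored, so a brief breath within a sentence does not produce a new voice data segment.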

Further, on the basis of this embodiment, the preset duration in step S202 may comprise several criteria. The optimized process for S201 to S203 is shown in Fig. 3, which is a flowchart of another method for generating a voice data packet provided by an embodiment of this application. The detailed process is as follows:

S301: collect the recording of the conference speech and generate the corresponding voice data from the recording;

S302: determine whether the voice data contains a blank; if so, go to S303;

S303: determine whether the blank is longer than a first preset duration and shorter than a second preset duration; if so, go to S304; if not, go to S305;

S304: divide the voice data into voice data segments using the blank as a cut point; the voice data packet includes all of the voice data segments;

S305: determine whether the blank is longer than the second preset duration; if so, go to S306;

S306: delete the blank, and divide the voice data into voice data segments using the blank as a cut point.

The first preset duration and the second preset duration are not limited here. The first preset duration is used to identify normal pauses during the speech, while the second preset duration is used to identify prolonged pauses, for example while a speaker is taking the stage, participants are talking among themselves, or some other event is taking place, so that for a short time (often a few minutes) there is no conference speech that actually needs to be recorded. When the voice data packet (that is, the voice data segments of this embodiment) is generated, such a blank can be deleted.

For example, a blank longer than 0.5 s (that is, the first preset duration is 0.5 s) may be regarded as a normal pause during the speech, while a blank longer than 10 s (that is, the second preset duration is 10 s) is regarded as the end of a stretch of speech, and that blank is deleted.
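The two-threshold decision of S303 to S306 can be sketched on a timeline in seconds. The blanks are assumed to have been detected already and are given as `(start, end)` pairs; the function name `segment_times`, the choice to cut a normal pause at its midpoint, and the treatment of a blank exactly equal to the second preset duration as a normal pause are assumptions, since this application does not specify them.

```python
from typing import List, Sequence, Tuple


def segment_times(total_seconds: float, blanks: Sequence[Tuple[float, float]],
                  first_preset: float = 0.5,
                  second_preset: float = 10.0) -> List[Tuple[float, float]]:
    """Return (start, end) times of the voice data segments.

    Blanks longer than `second_preset` are removed from the output entirely;
    blanks longer than `first_preset` only mark a cut point; shorter blanks
    are ignored.
    """
    segments, segment_start = [], 0.0
    for blank_start, blank_end in sorted(blanks):
        length = blank_end - blank_start
        if length > second_preset:
            # Prolonged pause: cut here and drop the blank itself.
            segments.append((segment_start, blank_start))
            segment_start = blank_end
        elif length > first_preset:
            # Normal pause: cut in the middle, keep the audio.
            middle = (blank_start + blank_end) / 2
            segments.append((segment_start, middle))
            segment_start = middle
    segments.append((segment_start, total_seconds))
    return [(a, b) for a, b in segments if b > a]
```

With the example thresholds of 0.5 s and 10 s, a 15 s gap between speakers disappears from the segments, while a 1 s pause merely separates two adjacent segments.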

Further, combining the two preferred embodiments for step S101 above, another method for generating voice data segments is provided here. The detailed process is as follows:

S401: collect the recording of the conference speech and generate the corresponding voice data from the recording;

S402: determine whether the voice data contains a blank; if so, go to S403;

S403: determine whether the blank is longer than a third preset duration and shorter than a fourth preset duration; if so, go to S404; if not, go to S407;

S404: take the blank as a pending cut point, and determine whether the time between the pending cut point and the previous actual cut point falls within a preset duration range; if so, go to S405; if not, go to S406;

S405: take the pending cut point as the next actual cut point;

S406: cancel the pending cut point and return to S401;

S407: determine whether the blank is longer than the fourth preset duration; if so, go to S408;

S408: delete the blank, and take the blank as an actual cut point;

S409: after all the actual cut points in the voice data have been identified, divide the voice data into voice data segments at the actual cut points.

This embodiment aims to keep the duration of every voice data segment within a preset duration range (also referred to below as the fifth preset duration). The third preset duration in this embodiment has the same practical meaning as the first preset duration in the previous embodiment, and the fourth preset duration has the same practical meaning as the second preset duration in the previous embodiment. The preset duration range is not limited here; for example, it may be 50 ± 10 s. When the voice data is actually divided into voice data segments, overly short segments should be avoided, for example a segment containing only very brief content such as "everybody, take a look". Therefore, when a pause in the speech is detected during division, it is first treated as a pending cut point, and it is then determined whether the time from the previous actual cut point to this pending cut point satisfies the fifth preset duration. To illustrate the terms with this example: if there is an actual cut point before "everybody, take a look", there should be a pending cut point after it. That pending cut point is only about 1 s after the previous actual cut point, far short of about one minute, so the pending cut point is cancelled and the next blank is examined. It should be noted that the preset duration range here is a range of values, unlike the third and fourth preset durations, which are usually exact values; its purpose is to ensure that all voice data segments have nearly the same duration.
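The pending and actual cut points of S403 to S409 can be sketched as below. Blanks are again assumed to be given as `(start, end)` times in seconds. The range of 40 to 60 s mirrors the 50 ± 10 s example; the function name, the choice to place a pending cut point at the start of the blank, and the rule that a long blank always becomes an actual cut point regardless of the range follow one reading of the steps and are assumptions.

```python
from typing import List, Sequence, Tuple


def segment_with_range(total_seconds: float, blanks: Sequence[Tuple[float, float]],
                       third_preset: float = 0.5, fourth_preset: float = 10.0,
                       min_length: float = 40.0,
                       max_length: float = 60.0) -> List[Tuple[float, float]]:
    """Return segment (start, end) times, accepting a normal pause as a cut
    point only if the resulting segment length lies in [min_length, max_length]."""
    segments, last_cut = [], 0.0
    for blank_start, blank_end in sorted(blanks):
        length = blank_end - blank_start
        if length > fourth_preset:
            # Long blank: always an actual cut point, and the blank is deleted.
            segments.append((last_cut, blank_start))
            last_cut = blank_end
        elif length > third_preset:
            # Pending cut point: keep it only if the segment length is in range.
            if min_length <= blank_start - last_cut <= max_length:
                segments.append((last_cut, blank_start))
                last_cut = blank_start
            # Otherwise the pending cut point is cancelled.
    segments.append((last_cut, total_seconds))
    return [(a, b) for a, b in segments if b > a]
```

In the "everybody, take a look" situation, the pause one second after the previous cut point is rejected, so that short phrase stays attached to the speech that follows it.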

Based on each of the above embodiments in which voice data is divided into voice data segments, as a preferred embodiment, when an invalid voice data segment or a damaged voice data segment exists, the method further includes:

Determine the timestamp corresponding to the invalid voice data segment or the damaged voice data segment;

Determine the auxiliary voice data segment that covers the same time as the timestamp;

Replace the invalid voice data segment or the damaged voice data segment with the auxiliary voice data segment, where the auxiliary voice data segment and the invalid or damaged voice data segment were collected by different recording devices.

This embodiment illustrates that, when the voice data packet of some recording device contains an invalid or damaged voice data segment, the auxiliary voice data segment with the corresponding timestamp in the voice data packet of another recording device can replace it, ensuring the integrity and accuracy of the text data obtained by conversion. Invalid and damaged voice data segments here are segments that interfere with conversion so that normal text data cannot be obtained, for example segments that cannot be converted at all (for reasons including but not limited to format errors and encoding errors) and segments whose conversion is obviously wrong (including but not limited to garbled output, or sentences that are clearly incoherent or do not follow the sentence structure of Chinese).

Unlike the replacement of a whole voice data packet in step S103, this embodiment only replaces individual voice data segments according to their timestamps, without replacing the complete voice data packet. This greatly improves the efficiency of handling failures during conversion and gives participants a better experience.
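Segment-level replacement by timestamp can be sketched as follows, assuming the segments of each recording device are held in a dictionary keyed by timestamp and that some validity check is available. The function name `repair_segments` and the `is_valid` callback are hypothetical; how invalid or damaged segments are detected is left open by this application.

```python
from typing import Callable, Dict, TypeVar

Segment = TypeVar("Segment")


def repair_segments(primary: Dict[float, Segment],
                    auxiliary: Dict[float, Segment],
                    is_valid: Callable[[Segment], bool]) -> Dict[float, Segment]:
    """Replace each invalid primary segment with the auxiliary segment that
    has the same timestamp, when a valid one exists."""
    repaired = {}
    for timestamp, segment in primary.items():
        if is_valid(segment):
            repaired[timestamp] = segment
        elif timestamp in auxiliary and is_valid(auxiliary[timestamp]):
            repaired[timestamp] = auxiliary[timestamp]   # from another device
        else:
            repaired[timestamp] = segment                # nothing better available
    return repaired
```

Only the affected timestamps change, so the rest of the primary recording is reused as it is. In practice the two devices would need aligned segment boundaries for the timestamps to match exactly, which this sketch simply assumes.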

Note in particular that, in the above embodiments for step S101, the complete voice data is generated first and then divided into voice data segments. In practice, the division methods of the above embodiments can also be applied immediately as the voice data is generated, without waiting for the complete voice data packet, so that division keeps pace with the speech and the conference speech record is generated more efficiently.

Following the idea of dividing voice data into voice data segments, other division methods are also possible. For example, every two adjacent voice data segments may partly overlap, which improves the continuity and completeness between the pieces of text data obtained from the segments after conversion.
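The overlapping variant can be sketched as a sliding window over the samples. The function name `split_with_overlap` and its parameters are illustrative assumptions; this application does not say how large the overlap should be.

```python
from typing import List, Sequence


def split_with_overlap(samples: Sequence[float], size: int,
                       overlap: int) -> List[Sequence[float]]:
    """Split into windows of `size` samples where each window repeats the
    last `overlap` samples of the previous one."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    if not samples:
        return []
    step = size - overlap
    last_start = max(len(samples) - overlap, 1)
    return [samples[start:start + size] for start in range(0, last_start, step)]
```

Because neighbouring segments share samples, a word that straddles a boundary appears whole in at least one of them, at the cost of having to remove duplicated text when the segment texts are spliced.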

A system for generating a conference speech record provided by an embodiment of this application is introduced below. The generation system described below and the generation method described above can be read with reference to each other.

Referring to Fig. 4, Fig. 4 is a schematic structural diagram of a system for generating a conference speech record provided by an embodiment of this application. This application also provides a system for generating a conference speech record, comprising:

A data receiving device 100, for receiving a voice data packet of a conference speech and device identity data;

A recording conversion device 200, for performing speech-to-text conversion on the voice data packet to obtain text data;

A record distribution device 300, for generating a meeting-record acquisition code from the device identity data, so that the text data can be obtained by means of the meeting-record acquisition code.

Referring to Fig. 5, Fig. 5 is a schematic structural diagram of another system for generating a conference speech record provided by an embodiment of this application. It can be understood that the generation system may further include:

A recording device 400, connected to the data receiving device 100, for collecting a recording of the conference speech and generating the voice data packet from the recording.

The data receiving device 100 is a module or unit with a data receiving function. Its specific way of receiving data is not limited here; transmission may be wired or wireless (including but not limited to a Bluetooth module, an infrared module, a Zigbee module, GPRS, etc.). The data receiving device 100 is connected to the recording device 400 and to the recording conversion device 200, and forwards the voice data from the recording device 400 to the recording conversion device 200. For the specific form of the voice data, refer to the embodiments of the generation method for a conference speech record described above; details are not repeated here.

The recording conversion device 200 receives the voice data forwarded by the data receiving device and performs voice conversion on it to generate the corresponding text data. Preferably, the recording conversion device may further include a memory for storing the voice data and the generated text data.

The function of the record distribution device 300 is to generate the meeting record acquisition code. It must of course be connected to the recording conversion device 200 so that the meeting record acquisition code can be associated with the text data. It should be noted that the data receiving device 100 and the recording conversion device 200 may be local, or may reside on a remote server such as a cloud server, in which case they are devices within the cloud server. The cloud server then externally provides the functions of receiving the voice data packet and device identity data of the conference speech and of performing voice conversion on the voice data packet to obtain text data.

The recording device 400 generally includes a microphone. Since the recording device 400 must be connected to the data receiving device 100, it also includes a corresponding communication component, communication cable, or the like. In addition, the recording device 400 may be fitted with warning components, such as a signal lamp that displays the recording state, so that a fault in the recording device 400 is flagged in time and does not affect the generation of the meeting record.

When the data receiving device 100 and the recording conversion device 200 are local, their specific form is not limited here; typically they are a single-chip microcomputer or similar component with communication and voice conversion functions. In this case, the data receiving device 100 and the recording conversion device 200 may be packaged into a first integrated device 10, so that the whole generation system includes the first integrated device 10 and the record distribution device 300.

So far, the data receiving device 100, the recording conversion device 200, the record distribution device 300, and the recording device 400 in the present application may be connected in the following ways:

A. Recording device 400, first integrated device 10, and record distribution device 300;

Referring to Fig. 6, Fig. 6 is a first schematic structural diagram of another generation system for a conference speech record provided by an embodiment of the present application. In this case, the recording device 400 may be placed wherever the conference speech can be captured, for example under the lectern or near the microphone; no specific limitation is made here. The first integrated device 10 may be placed anywhere in the meeting room, preferably inside a wall or under the floor. The record distribution device 300 provides the meeting record acquisition code. Its specific form may be set by the meeting organizer; for example, the record distribution device 300 may be installed at the meeting room entrance as a fixed terminal, or the code may be provided as a QR code or the like.

B. Recording device 400 and second integrated device 20; the second integrated device 20 includes the data receiving device 100, the recording conversion device 200, and the record distribution device 300;

Referring to Fig. 7, Fig. 7 is a second schematic structural diagram of another generation system for a conference speech record provided by an embodiment of the present application. In this case, the meeting organizer only needs to obtain the meeting record acquisition code generated by the second integrated device 20 (by the record distribution device 300 within it) and provide it to the participants.

It should be noted that the second integrated device 20 is local in this case. To better protect the text data of the meeting record and to make further use of it, the second integrated device 20 may be connected to a cloud server in order to upload at least the text data (the voice data and/or device identity data may of course also be uploaded), which amounts to storing the text data in the cloud.

C. Recording device 400, cloud server, and record distribution device 300;

Referring to Fig. 8, Fig. 8 is a third schematic structural diagram of another generation system for a conference speech record provided by an embodiment of the present application. In this case, the data receiving device 100 and the recording conversion device 200 are part of the cloud server, and the recording device 400 and the record distribution device 300 are connected to the cloud server, implementing a process of local recording, cloud processing, and local distribution.

Further, via the cloud server, the recording device 400 may also upload its own operating status data, enabling cloud monitoring of the recording device 400 and ensuring that generation of the meeting record is not affected.

Several preferred embodiments of the generation system for a conference speech record provided by this embodiment have been described above. It should further be noted that, when any data is uploaded to the cloud server, a corresponding format conversion may be needed to meet the cloud protocol requirements. Likewise, if protocol encapsulation is required when data is transmitted between devices, the device sending the data should contain a component such as a processor or chip; no limitation is made here.

Based on the above embodiments, as a preferred embodiment, the generation system may also be accessed by a mobile terminal 500. Referring to Fig. 9, Fig. 9 is a first schematic structural diagram of yet another generation system for a conference speech record provided by an embodiment of the present application. Specifically, the recording device 400 may be configured to connect to a network after receiving a network connection instruction from the mobile terminal 500, and to send the voice data packet to the data receiving device 100 over that network.

This embodiment is intended for the case where the recording device uses a wireless connection, such as GPRS: the mobile terminal 500 can be used to configure the network for the recording device 400 so that the recording device 400 accesses the network.

Further, when the connection between the recording device 400 and the network is interrupted, a network connection unit in the recording device 400 is used to reconnect automatically or to switch networks automatically.

When the recording device 400 accesses the network wirelessly, an unstable wireless network may make it difficult or impossible to send the voice data packet. In that case, the network connection unit in the recording device 400 reconnects automatically or switches automatically to another wireless network.
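
The automatic reconnection and switching behaviour described above can be sketched as a simple retry loop. The application does not define the interface of the network connection unit, so `send_with_failover`, the `send` callback and the `retries_per_network` parameter are hypothetical names introduced only for this illustration.

```python
from typing import Callable, Optional, Sequence


def send_with_failover(
    packet: bytes,
    networks: Sequence[str],
    send: Callable[[str, bytes], bool],
    retries_per_network: int = 2,
) -> Optional[str]:
    """Try each network in order; retry on the same network before switching.

    `send(network, packet)` is assumed to return True on success.
    Returns the name of the network that delivered the packet, or None.
    """
    for network in networks:
        for _ in range(retries_per_network):  # automatic reconnection
            if send(network, packet):
                return network
        # all retries failed: switch automatically to the next network
    return None
```

Retrying on the current network first keeps the device on its configured connection when the fault is brief, and falls back to another wireless network only when the fault persists.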

Based on the above embodiments, referring to Fig. 10, Fig. 10 is a second schematic structural diagram of yet another generation system for a conference speech record provided by an embodiment of the present application. The mobile terminal 500 may also be connected only to the record distribution device 300. The record distribution device 300 sends the meeting record acquisition code to the mobile terminal 500, and the mobile terminal 500 can then obtain the text data of the meeting record directly according to the meeting record acquisition code.

Further, when a mobile terminal 500 is connected, the meeting record acquisition code may be a QR code. Scanning the QR code downloads a dedicated meeting app, and the user can log in to the app to obtain, locally store, or forward the text data. The dedicated meeting app is of course provided by the meeting organizer; accounts may be set up for it, with an account ID and password provided to each participant, which improves the confidentiality of the meeting content to some extent. Specific applications based on the mobile terminal 500 are not enumerated or limited here.

Based on the previous embodiment, as a preferred embodiment, the recording device 400 may include:

An acquisition unit, configured to capture a recording of the conference speech and generate corresponding voice data from the recording;

A judging unit, configured to judge whether the voice data contains a blank exceeding a preset duration;

A data packet generation unit, configured to divide the voice data into voice data segments, using the blank as a split point, when the judging unit's result is yes; the voice data packet includes all of the voice data segments.

From the above description it is not difficult to see that the judging unit is usually a blank-sound judging circuit, and that the recording device 400 contains a processor that includes the data packet generation unit in order to execute the steps corresponding to that unit.

Based on the previous embodiment, as a preferred embodiment, when the recording device 400 includes the data packet generation unit, the recording conversion device 200 may include:

A conversion unit, configured to perform voice conversion on each voice data segment to obtain corresponding segment text data;

A splicing unit, configured to splice the segment text data into the text data.

Further, the splicing unit may specifically be a unit configured to splice the segment text data in chronological order, according to the timestamp corresponding to each voice data segment, to generate the text data.
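
Splicing by timestamp amounts to sorting the segment text data before joining it. The sketch below assumes each piece of segment text data arrives as a `(timestamp, text)` pair and that a single space is an acceptable separator; both are illustrative assumptions, as is the name `splice_by_timestamp`.

```python
from typing import List, Tuple


def splice_by_timestamp(fragments: List[Tuple[float, str]]) -> str:
    """Join segment text data in chronological order of its timestamps."""
    ordered = sorted(fragments, key=lambda fragment: fragment[0])
    return " ".join(text for _, text in ordered)
```

Because the sort uses only the timestamp, segments may be converted in parallel and arrive in any order without affecting the final text.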

Based on the previous embodiment, as a preferred embodiment, when the recording device 400 includes the data packet generation unit and the voice data segments contain an invalid voice data segment or a damaged voice data segment, the recording conversion device 200 may further include:

A timestamp determination unit, configured to determine the timestamp corresponding to the invalid voice data segment or the damaged voice data segment;

A replacement segment determination unit, configured to determine a secondary voice data segment whose time matches the timestamp;

A replacement unit, configured to replace the invalid voice data segment or the damaged voice data segment with the secondary voice data segment; wherein the secondary voice data segment and the invalid voice data segment or damaged voice data segment are captured by different recording devices.

It can be understood that, in each of the generation systems for conference speech records above, the devices and units can run automatically without relying on manual instructions. Once the whole generation system is started, the conference speech record is therefore generated automatically and supplied, via the mobile terminal 500 or in other forms, to those who need the content of the conference speech, which brings great convenience.
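
The replacement of invalid or damaged voice data segments described above can be sketched as a lookup by timestamp. The `Segment` structure, its `valid` flag and the function name `replace_bad_segments` are hypothetical; the application does not specify how a segment is marked invalid or how timestamps from different recording devices are aligned, so exact timestamp equality is assumed here.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Segment:
    timestamp: int      # assumed to be comparable across recording devices
    device_id: str      # which recording device captured the segment
    audio: bytes
    valid: bool = True  # False for an invalid or damaged segment


def replace_bad_segments(
    primary: List[Segment], secondary: List[Segment]
) -> List[Segment]:
    """Swap each bad primary segment for a secondary one with the same timestamp.

    A substitute is used only if it is valid and comes from a different
    recording device; otherwise the original segment is kept unchanged.
    """
    backup: Dict[int, Segment] = {s.timestamp: s for s in secondary if s.valid}
    result: List[Segment] = []
    for segment in primary:
        substitute = backup.get(segment.timestamp)
        if (
            not segment.valid
            and substitute is not None
            and substitute.device_id != segment.device_id
        ):
            result.append(substitute)
        else:
            result.append(segment)
    return result
```

Keeping the original segment when no suitable substitute exists leaves the gap visible to later stages instead of silently dropping the time slot.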

The present application also provides a computer-readable storage medium storing a computer program which, when executed, can implement the steps of the generation method for a conference speech record provided by the above embodiments. The storage medium may include various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

The embodiments in this specification are described progressively; each embodiment focuses on its differences from the others, and identical or similar parts of the embodiments may be cross-referenced. Since the system provided by the embodiments corresponds to the method provided by the embodiments, its description is relatively brief; for relevant details, refer to the description of the method.

Specific examples are used herein to illustrate the principles and implementations of the present application; the description of the above embodiments is only intended to help in understanding the method of the present application and its core idea. It should be pointed out that those of ordinary skill in the art may make improvements and modifications to the present application without departing from its principles, and such improvements and modifications also fall within the scope of protection of the claims of the present application.

It should also be noted that, in this specification, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or device that includes that element.

Claims

1. A generation method for a conference speech record, characterized by comprising:

receiving a voice data packet and device identity data of a conference speech;

performing voice conversion on the voice data packet to obtain text data;

generating a meeting record acquisition code according to the device identity data, so that the text data can be obtained according to the meeting record acquisition code.

2. The generation method according to claim 1, characterized in that, before receiving the voice data packet and device identity data of the conference speech, the method further comprises:

3. The generation method according to claim 2, characterized in that capturing a recording of the conference speech and generating the voice data packet from the recording

4. The generation method according to claim 3, characterized in that performing voice conversion on the voice data packet to obtain text data specifically comprises:

splicing the segment text data into the text data.

5. The generation method according to claim 3, characterized in that splicing the segment text data into the text data comprises:

splicing in chronological order according to the timestamp corresponding to each voice data segment, to generate the text data.

6. The generation method according to claim 5, characterized in that, when an invalid voice data segment or a damaged voice data segment exists, the method further comprises:

determining a secondary voice data segment whose time matches the timestamp;

replacing the invalid voice data segment or the damaged voice data segment with the secondary voice data segment; wherein the secondary voice data segment and the invalid voice data segment or the damaged voice data segment are captured by different recording devices.

7. The generation method according to claim 1, characterized in that the meeting record acquisition code is a QR code.

8. A generation system for a conference speech record, characterized by comprising:

a record distribution device, configured to generate a meeting record acquisition code according to the device identity data, so that the text data can be obtained according to the meeting record acquisition code.

9. The generation system according to claim 8, characterized in that the data receiving device and the recording conversion device are devices in a cloud server.

10. The generation system according to claim 8, characterized by further comprising:

a recording device, connected to the data receiving device and configured to capture a recording of the conference speech and generate the voice data packet from the recording.

11. The generation system according to claim 10, characterized in that the recording device is configured to connect to a network after receiving a network connection instruction from a mobile terminal, and to send the voice data packet to the data receiving device over the network.

12. The generation system according to claim 11, characterized in that, when the connection between the recording device and the network is interrupted, a network connection unit in the recording device is used to reconnect automatically or to switch networks automatically.

13. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the generation method according to any one of claims 1-7.