WO2021221440A1

WO2021221440A1 - Method for enhancing sound quality, and device therefor

Info

Publication number: WO2021221440A1
Application number: PCT/KR2021/005314
Authority: WO
Inventors: 이상훈; 김현욱; 양현철; 문한길; 박상수; 심환; 양재모; 이용훈
Original assignee: Samsung Electronics Co., Ltd. (삼성전자 주식회사)
Priority date: 2020-04-28
Filing date: 2021-04-27
Publication date: 2021-11-04
Also published as: KR20210133004A

Abstract

An electronic device, according to one embodiment disclosed in the present document, comprises: a communication circuit; a plurality of microphones; a memory in which instructions are stored; and a processor operatively connected to the communication circuit, the plurality of microphones, and the memory, wherein the processor: performs a short-range communication connection supporting a first wireless communication protocol with a portable device through the communication circuit; acquires a plurality of voice signals through the plurality of microphones; generates a first packet having a data size defined in the first wireless communication protocol on the basis of the acquired plurality of voice signals, the first packet including a header, a first voice signal, a second voice signal, and an additional information block (side info) indicating that the first packet includes the first voice signal and the second voice signal; and transmits the first packet to the portable device through the communication circuit.

Description

Sound quality improvement method and device

Various embodiments disclosed in this document relate to a method and apparatus for improving sound quality in a voice call device using a wireless communication technology such as Bluetooth.

With the development of wireless communication technology, various types of wearable devices are being used together with smart phones. In particular, headsets used for listening to music or making calls are changing from a wired connection method to a wireless communication technology such as Bluetooth. When using a wireless headset, the number of microphones applied to the wireless headset is also increasing in order to improve sound quality and eliminate noise.

Unlike the case of making a call using a mobile terminal such as a smart phone, when making a call using a wireless headset, the number of voice signals transmitted from the wireless headset to the mobile terminal may be limited due to the nature of the wireless communication environment. For this reason, operations for improving sound quality and removing noise may be performed in a wireless headset using a low-power chipset rather than a portable terminal.

An electronic device according to an embodiment disclosed in this document includes: a communication circuit; a plurality of microphones; a memory in which instructions are stored; and a processor operatively coupled with the communication circuit, the plurality of microphones, and the memory. The processor may be configured to: perform a short-range communication connection supporting a first wireless communication protocol with a portable device through the communication circuit; acquire a plurality of voice signals through the plurality of microphones; generate, based on the acquired plurality of voice signals, a first packet having a data size defined in the first wireless communication protocol, the first packet including a header, a first voice signal, a second voice signal, and an additional information block (side info) indicating that the first packet includes the first voice signal and the second voice signal; and transmit the first packet to the portable device through the communication circuit.

An electronic device according to an embodiment disclosed in this document includes: a speech enhancement circuit; a communication circuit; a memory in which instructions are stored; and a processor operatively coupled with the speech enhancement circuit, the communication circuit, and the memory. The processor may be configured to: establish a short-range communication connection supporting a first wireless communication protocol with a headset device through the communication circuit; receive, from the headset device through the communication circuit, a first packet based on a plurality of voice signals acquired through a plurality of microphones of the headset device, the first packet including a header, a first voice signal, a second voice signal, and an additional information block (side info) indicating that the first packet includes the first voice signal and the second voice signal; and obtain the first voice signal and the second voice signal from the first packet.

A system according to an embodiment disclosed in this document includes: a mobile terminal; and a headset device connected to the mobile terminal through a short-range communication connection based on a first wireless communication protocol. The headset device may acquire a plurality of voice signals; generate, based on the plurality of voice signals, a first packet having a data size defined in the first wireless communication protocol, the first packet including a header, a first voice signal, a second voice signal, and an additional information block (side info) indicating that the first packet includes the first voice signal and the second voice signal; and transmit the first packet to the mobile terminal through a first communication circuit. The mobile terminal may receive, through a second communication circuit, the first packet based on the plurality of voice signals acquired through a plurality of microphones of the headset device; obtain the first voice signal and the second voice signal from the first packet; and transmit the first voice signal and the second voice signal to an external device through the second communication circuit.

Various embodiments disclosed in this document may provide a method and apparatus for improving sound quality and removing noise in a voice communication device using a wireless communication technology. Specifically, the wireless headset may transmit a packet including a plurality of microphone signals to the mobile terminal. Because the method for transmitting a plurality of microphone signals proposed in this document changes only the payload structure of a packet transmitted in the existing Bluetooth standard method, the existing standard transmission method can be used as it is. The standard mSBC (modified sub-band codec) and CVSD (continuously variable slope delta) codecs use all 60 bytes to transmit one signal. The wireless headset in this document, however, uses a vendor-specific codec rather than the mSBC and CVSD codecs to provide the same or higher sound quality at half the bit rate of mSBC and CVSD.
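The payload budget implied by the preceding paragraph may be illustrated with simple arithmetic. The following is only a sketch: the 60-byte packet capacity is taken from the paragraph above and the 2-byte header corresponds to the H2 header size stated elsewhere in this document, whereas the side info size (`side_info_bytes`) and the function name `bytes_per_signal` are hypothetical and chosen purely for illustration.

```python
def bytes_per_signal(packet_bytes=60, header_bytes=2, side_info_bytes=2, num_signals=2):
    """Return how many bytes each encoded voice signal may occupy in one packet.

    packet_bytes and header_bytes follow the sizes given in this document;
    side_info_bytes is an assumed value, not a size defined by the document.
    """
    available = packet_bytes - header_bytes - side_info_bytes
    if num_signals < 1 or available < num_signals:
        raise ValueError("packet too small for the requested number of signals")
    return available // num_signals


# One signal with no side info versus two signals sharing the same packet.
single = bytes_per_signal(side_info_bytes=0, num_signals=1)
dual = bytes_per_signal(num_signals=2)
print(single, dual)
```

Under these assumptions, each of the two signals receives slightly less than half of the bytes available to a single signal, which is consistent with a codec operating at roughly half the bit rate.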

In addition, various effects directly or indirectly identified through this document may be provided.

FIG. 1 illustrates a system in which a wireless headset transmits a plurality of voice signals to a portable terminal according to an embodiment.

FIG. 2 is a block diagram of an electronic device according to an embodiment.

FIG. 3 illustrates an example in which a wireless headset transmits a plurality of packets, each including a plurality of voice signals, to a portable terminal according to an embodiment.

FIG. 4 illustrates a configuration of a packet including a plurality of voice signals according to an embodiment.

FIG. 5 is a flowchart illustrating an operation in which a wireless headset transmits one packet including a plurality of voice signals to a portable terminal according to an embodiment.

FIG. 6 is a flowchart illustrating an operation in which a mobile terminal transmits a plurality of voice signals included in a packet received from a wireless headset to an external device according to an exemplary embodiment.

FIG. 7 is a flowchart illustrating an operation of transmitting one packet including a plurality of voice signals between a first electronic device, a second electronic device, and an external device according to an embodiment.

FIG. 8A illustrates a case in which voice processing is performed only in a mobile terminal according to an embodiment.

FIG. 8B illustrates a first example of performing voice processing in a mobile terminal and a wireless headset according to an embodiment.

FIG. 8C illustrates a second example of performing voice processing in a mobile terminal and a wireless headset according to an embodiment.

FIG. 9 illustrates a method of processing a voice when a plurality of voice signals are acquired through a plurality of wearable devices according to an embodiment.

FIG. 10 is a block diagram of an electronic device in a network environment according to an embodiment.

In connection with the description of the drawings, the same or similar reference numerals may be used for the same or similar components.

Hereinafter, various embodiments will be described with reference to the accompanying drawings. However, this is not intended to limit the specific embodiments, and it should be understood that various modifications, equivalents, and/or alternatives of the embodiments are included.

FIG. 1 illustrates a system 130 in which a wireless headset transmits a plurality of voice signals to a portable terminal according to an embodiment.

Hereinafter, the system 130 in which one packet including a plurality of voice signals is transmitted from the electronic device 100 to the portable terminal 120 will be described.

According to an embodiment, the system 130 may include the electronic device 100 and the portable terminal 120 . The components of the system 130 are not limited to the components shown in FIG. 1 , and new components may be added or components shown in FIG. 1 may be changed or omitted. For example, at least a part of the description of the electronic device 1001 illustrated in FIG. 10 may be applied to the portable terminal 120 illustrated in FIG. 1 .

According to an embodiment, the electronic device 100 may include a plurality of microphones 110. In addition, the electronic device 100 may function as a voice communication device using a short-range wireless communication technology. For example, the electronic device 100 may acquire a plurality of voice signals through the plurality of microphones 110, and may transmit one packet generated based on the acquired plurality of voice signals to the mobile terminal 120 connected to the electronic device 100 through short-range wireless communication. For example, the electronic device 100 may be understood as a wireless headset device capable of transmitting one packet including a plurality of voice signals to the portable terminal 120 using Bluetooth, which is a short-range wireless communication technology.

According to an embodiment, the mobile terminal 120 may be understood as an electronic device that receives one packet including a plurality of voice signals from the electronic device 100 (e.g., a wireless headset device) and transmits the plurality of voice signals included in the packet to an external device (e.g., a mobile terminal of a call counterpart). The mobile terminal 120 is not limited to the above-described example. For example, the mobile terminal 120 may include at least one of a tablet personal computer (PC), a smart watch, and a wearable device.

The main scenario described below assumes that a plurality of voice signals are transmitted from the electronic device 100 to the portable terminal 120 in one packet. The same description may be applied to other cases in which voice signals are included in a packet and transmitted. According to an embodiment, the electronic device 100 may transmit a plurality of packets, each including a plurality of voice signals, to the portable terminal 120.

According to an embodiment, the electronic device 100 may include a microphone input module 111, a first voice processing module 112, a voice encoder 113, and a packet construction and transmission module 114. A module included in the electronic device 100 may be understood as a hardware module (e.g., a circuit) included in the electronic device 100.

According to an embodiment, the microphone input module 111 may obtain and process a voice signal. The microphone input module 111 may provide a plurality of voice signals acquired through the plurality of microphones 110 to the first voice processing module 112 . For example, the electronic device 100 may provide two voice signals acquired through the microphone input module 111 including two microphones to the first voice processing module 112 .

According to an embodiment, the first voice processing module 112 may process and enhance the plurality of voice signals provided from the microphone input module 111 . The processing and improvement may refer to voice processing for improving the sound quality of a voice signal.

According to an embodiment, the first voice processing module 112 may be understood as a speech enhancement module that performs voice processing to improve the sound quality of a voice signal. The voice processing may be performed using at least one sound quality improvement method among echo cancellation (EC), noise reduction (NR), beamforming, and bandwidth extension (BWE).

According to an embodiment, the first voice processing module 112 may reduce the number of the plurality of voice signals. The electronic device 100 may reduce the number of the plurality of voice signals by performing at least one of beamforming and mixing through the first voice processing module 112 . For example, the electronic device 100 may reduce the number of voice signals from three to two or one by performing beamforming or mixing.

According to an embodiment, when the number of microphones in the electronic device 100 is two or less, the operation of reducing the number of the plurality of voice signals through the first voice processing module 112 may be omitted. For example, when the electronic device 100 acquires two voice signals through two microphones, the electronic device 100 may omit the operation of reducing the number of voice signals through the first voice processing module 112.

According to an embodiment, when the number of microphones is three or more, the electronic device 100 may reduce the number of the plurality of voice signals to two or less through the first voice processing module 112. For example, when the electronic device 100 has three microphones, the electronic device 100 may obtain three voice signals (e.g., a first signal, a second signal, and a third signal). The first signal and the second signal may be signals obtained through sub microphones, and the third signal may be a signal obtained through a main microphone. The electronic device 100 may perform voice processing corresponding to echo cancellation on the first signal, the second signal, and the third signal through the first voice processing module 112. The electronic device 100 may then perform voice processing corresponding to beamforming on the first signal and the second signal, among the three echo-cancelled signals, thereby converting the first signal and the second signal into one signal. The electronic device 100 may transmit the third signal and the converted signal to the portable terminal 120. The mobile terminal 120 may mix the third signal and the converted signal through the second voice processing module 124 to produce a single fourth signal, and may perform voice processing corresponding to noise reduction on the fourth signal through the second voice processing module 124.

As another example, when the electronic device 100 has three microphones, the electronic device 100 may obtain three voice signals (e.g., a first signal, a second signal, and a third signal), where the first signal and the second signal are obtained through sub microphones and the third signal is obtained through a main microphone. The electronic device 100 may perform voice processing corresponding to echo cancellation on the first signal, the second signal, and the third signal. The electronic device 100 may then mix the first signal and the third signal to convert them into a single signal, and may transmit the second signal and the converted signal to the portable terminal 120. The mobile terminal 120 may perform voice processing corresponding to beamforming on the second signal and the converted signal through the second voice processing module 124 to produce a single fifth signal, and may perform voice processing corresponding to noise reduction on the fifth signal through the second voice processing module 124. The voice processing operations and the operations of reducing the number of voice signals are not limited to the above-described examples.
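The headset-side portion of the first example above (echo cancellation on three signals, followed by beamforming of the two sub-microphone signals) may be sketched as follows. This is a simplified illustration and not the processing actually used by the electronic device 100: echo cancellation is reduced to subtracting a given echo estimate, and beamforming is reduced to a zero-delay delay-and-sum (a sample-wise average). All function names are hypothetical.

```python
def cancel_echo(signal, echo_estimate):
    """Placeholder echo cancellation: subtract an echo estimate sample by sample."""
    return [s - e for s, e in zip(signal, echo_estimate)]


def beamform(sig_a, sig_b):
    """Zero-delay delay-and-sum beamformer: average two time-aligned signals."""
    return [(a + b) / 2 for a, b in zip(sig_a, sig_b)]


def reduce_to_two(first, second, third, echo_estimate):
    """Reduce three microphone signals to two, as in the first example.

    first and second are the sub-microphone signals; third is the main
    microphone signal. Returns (third, converted), the two signals that
    would be passed on for encoding and transmission.
    """
    first, second, third = (
        cancel_echo(sig, echo_estimate) for sig in (first, second, third)
    )
    converted = beamform(first, second)
    return third, converted


main_sig, converted_sig = reduce_to_two(
    [1.0, 2.0], [3.0, 4.0], [5.0, 6.0], echo_estimate=[0.5, 0.5]
)
```

In this sketch the mixing and noise reduction steps are left to the receiving side, mirroring the division of work between the first voice processing module 112 and the second voice processing module 124 described above.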

According to an embodiment, the first voice processing module 112 may perform primary processing and improvement on the plurality of voice signals provided from the microphone input module 111. Also, the first voice processing module 112 may reduce the number of the plurality of voice signals to two or less. The first voice processing module 112 may provide the processed and improved voice signals to the voice encoder 113. For example, the first voice processing module 112 may perform primary processing and improvement on three voice signals provided from the microphone input module 111, reduce the number of the three voice signals to two or less, and provide the reduced voice signals to the voice encoder 113.

According to an embodiment, when the number of voice signals provided from the microphone input module 111 is two or less, the first voice processing module 112 may omit the processing and improvement process for the provided voice signals and provide the voice signals to the voice encoder 113. For example, when the number of the plurality of voice signals received from the microphone input module 111 is two, the first voice processing module 112 may either omit or perform the processing and improvement process for the voice signals, and provide the plurality of voice signals to the voice encoder 113. As another example, when the number of the plurality of voice signals received from the microphone input module 111 is two, the operation of reducing the number of the plurality of voice signals to two or less may be omitted.

According to an embodiment, the voice encoder 113 may receive a plurality of voice signals from the first voice processing module 112 or the microphone input module 111. The voice encoder 113 may encode or encrypt the plurality of voice signals provided from the first voice processing module 112 or the microphone input module 111. For example, the voice encoder 113 may encode two or more voice signals provided from the first voice processing module 112. The encoding may be understood as processing data so that the data can be transmitted through a wireless communication channel.

According to an embodiment, the voice encoder 113 may provide a plurality of encoded voice signals to the packet construction and transmission module 114 .

According to an embodiment, the packet construction and transmission module 114 may assemble the plurality of encoded voice signals provided from the voice encoder 113 into one packet, and may transmit the generated packet to an external device through a communication circuit. For example, the packet construction and transmission module 114 may assemble two encoded voice signals provided from the voice encoder 113 into one packet, and may transmit the generated packet at predetermined time intervals to an external device (e.g., a mobile terminal connected to the electronic device 100 through short-range wireless communication).

According to an embodiment, the portable terminal 120 may include a packet reception module 122 , a voice decoder 123 , and a second voice processing module 124 .

According to an embodiment, the packet reception module 122 may receive a packet from the electronic device 100. For example, the packet reception module 122 may receive a packet including a plurality of voice signals from the electronic device 100 at regular intervals.

According to an embodiment, the packet reception module 122 may provide a plurality of encoded voice signals present in a payload of a packet received from the electronic device 100 to the voice decoder 123. For example, the packet reception module 122 may provide an encoded bitstream present in the payload of one packet received from the electronic device 100 to the voice decoder 123. The payload may mean the part of the transmitted data that carries the content the transmission is intended to deliver. For example, the payload may mean the data corresponding to the plurality of voice signals, excluding the header and metadata included in one packet.

According to an embodiment, the voice decoder 123 may decode a plurality of voice signals provided from the packet reception module 122 . The voice decoder 123 may provide the decoded voice signals to the second voice processing module 124 . The decoding means returning data received through a wireless communication channel to before being encoded, and may be understood as the reverse of encoding.

According to an embodiment, the second voice processing module 124 may process and improve the decoded voice signals provided from the voice decoder 123. For example, the second voice processing module 124 may be understood as a speech enhancement module that performs voice processing to improve the sound quality of a voice signal. The voice processing may be performed using at least one sound quality improvement method among echo cancellation, noise reduction, beamforming, and bandwidth extension.

According to an embodiment, the second voice processing module 124 may process and improve voice signals using an algorithm having a higher complexity than the first voice processing module 112 . For example, the second speech processing module 124 may use a noise removal technique based on a neural network algorithm such as a deep neural network (DNN).

According to an embodiment, the mobile terminal 120 may transmit the plurality of voice signals processed and improved by the second voice processing module 124 to an external device (e.g., a mobile terminal of a call counterpart) through a communication circuit.

FIG. 2 is a block diagram of an electronic device 100 according to an embodiment.

According to an embodiment, the electronic device 100 (e.g., the electronic device 1001 or the electronic device 1002 of FIG. 10) may include a processor 210 (e.g., the processor 1020 of FIG. 10), a communication circuit 220 (e.g., the communication module 1090 of FIG. 10), a plurality of microphones 110, a speech enhancement 240, and a memory 250 (e.g., the memory 1030 of FIG. 10). Components included in the electronic device 100 include the components shown in the block diagram of FIG. 2 (e.g., the processor 210, the communication circuit 220, the plurality of microphones 110, the speech enhancement 240, and the memory 250). Components of the electronic device 100 illustrated in FIG. 2 may be replaced with other components, or additional components may be added to the electronic device 100.

According to an embodiment, the processor 210 may execute instructions stored in the memory 250 to control the operation of the components of the electronic device 100 (e.g., the communication circuit 220, the plurality of microphones 110, the speech enhancement 240, and the memory 250). The processor 210 may be electrically and/or operatively coupled to the communication circuit 220, the plurality of microphones 110, the speech enhancement 240, and the memory 250. The processor 210 may execute software to control at least one other component connected to the processor 210 (e.g., the communication circuit 220, the plurality of microphones 110, the speech enhancement 240, and the memory 250). The processor 210 may obtain commands from various components of the electronic device 100, interpret the obtained commands, and perform various data processing and/or operations according to the interpreted commands.

According to an embodiment, the communication circuit 220 may support communication over a direct communication channel (e.g., wired, Bluetooth (BT)) and/or an indirect communication channel (e.g., an access point (AP), a base station) so that data can be transmitted and received between the electronic device 100 and the mobile terminal 120.

According to an embodiment, the plurality of microphones 110 may acquire a plurality of voice signals. The plurality of microphones 110 may provide the voice signals to the first voice processing module 112 or the voice encoder 113 based on the number of the acquired plurality of voice signals. For example, when the number of the plurality of microphones 110 is two or less, the plurality of microphones 110 may provide the obtained voice signals to the voice encoder 113 . For another example, when the number of the plurality of microphones 110 is three or more, the plurality of microphones 110 may provide the obtained voice signals to the first voice processing module 112 .

According to an embodiment, the speech enhancement 240 may be understood as a module that performs voice processing and enhancement in order to improve the sound quality of the acquired voice signal. The speech enhancement 240 may perform voice processing using at least one sound quality improvement method among echo cancellation, noise reduction, beamforming, and bandwidth extension.

According to an embodiment, the memory 250 may store various data used by at least one component of the electronic device 100 (e.g., the communication circuit 220, the plurality of microphones 110, the speech enhancement 240, and the memory 250). For example, the memory 250 may store data of a plurality of voice signals acquired by the electronic device 100 through the plurality of microphones 110.

FIG. 3 illustrates an example in which the wireless headset transmits a plurality of packets, each including a plurality of voice signals, to the portable terminal 120 according to an embodiment.

According to an embodiment, the electronic device 100 may include a plurality of voice signals (e.g., mic0 (301) and mic1 (302)) in one packet and transmit the packet to the mobile terminal 120 every predetermined time (e.g., frame 0, frame 1, and frame n, where n is a natural number). The frame interval may mean the time interval at which the electronic device 100 transmits a packet to the mobile terminal 120, and mic0 (301) and mic1 (302) of frame 0 may mean that two voice signals are included in one packet. FIG. 3 illustrates an example in which two voice signals are included in one packet, but the present disclosure is not limited thereto, and three or more voice signals may be included in one packet.

According to an embodiment, the mobile terminal 120 may receive a packet including a plurality of voice signals from the electronic device 100 every predetermined time (e.g., every frame). The mobile terminal 120 may decode, through the voice decoder 123, the plurality of voice signals included in the packet received from the electronic device 100 every predetermined time. For example, the mobile terminal 120 may receive a packet including two voice signals (e.g., mic0 and mic1) from the electronic device 100 every predetermined time (e.g., frame 0, frame 1, and frame n, where n is a natural number), and may decode the two voice signals obtained from the received packet. The mobile terminal 120 may thereby obtain the decoded voice signals (e.g., decoding mic0 (311) and decoding mic1 (312)).

According to an embodiment, the packet transmitted from the electronic device 100 to the portable terminal 120 may include an H2 header 400, first microphone encoded data 401, second microphone encoded data (Mic 2 encoded data) 402, and an additional information block (side info) 403. The components of the packet are not limited to the components shown in FIG. 4, and components may be added or changed. For example, the packet may further include metadata in addition to the components shown in FIG. 4.

According to an embodiment, the H2 header 400 of the packet means a header defined in the Bluetooth standard, and may have a size of 2 bytes.

According to an embodiment, the encoded data of the plurality of voice signals (or the plurality of microphone signals) of the packet (e.g., the first microphone encoded data 401 and the second microphone encoded data 402) may be stored after the H2 header 400. Although FIG. 4 shows two pieces of encoded data, the packet is not limited thereto and may include three or more. The packet may be understood as a synchronous connection-oriented (SCO) packet; the SCO packet is a 2-EV3 packet and may carry up to 60 bytes of data per packet.

According to an embodiment, the additional information block (side info) 403 may include information related to the data of the plurality of encoded voice signals (e.g., the first microphone encoded data 401 and the second microphone encoded data 402). For example, the additional information block 403 may include at least one of: the number of microphones included in the electronic device 100; the number of voice signals (or voice channels); whether voice processing has been performed through the speech enhancement 240 in the electronic device 100 (e.g., a wireless headset); or information on the algorithm used by the first voice processing module 112 of the electronic device 100. The algorithm information may indicate which algorithm among echo cancellation, noise reduction, beamforming, and bandwidth extension the electronic device 100 used to perform voice processing.
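The packet structure described above (H2 header, two blocks of encoded microphone data, and a side info block within a 60-byte packet) may be sketched as follows. Only the 60-byte packet size, the 2-byte H2 header size, and the order of the fields come from this document. The 2-byte side info size, its bit layout, the resulting 28 bytes per channel, the algorithm flag values, the placeholder header bytes, and the function names are all assumptions made for illustration and do not represent the actual format.

```python
import struct

PACKET_SIZE = 60      # per-packet capacity stated in this document
H2_SIZE = 2           # H2 header size stated in this document
SIDE_INFO_SIZE = 2    # assumed size; not defined by this document
CHANNEL_SIZE = (PACKET_SIZE - H2_SIZE - SIDE_INFO_SIZE) // 2  # 28 bytes per signal

# Hypothetical bit flags for the algorithm information.
ALGO_EC, ALGO_NR, ALGO_BF, ALGO_BWE = 0x1, 0x2, 0x4, 0x8


def build_packet(h2, mic1, mic2, num_mics, num_channels, se_done, algos):
    """Assemble H2 header | mic 1 data | mic 2 data | side info into one packet."""
    if len(h2) != H2_SIZE or len(mic1) != CHANNEL_SIZE or len(mic2) != CHANNEL_SIZE:
        raise ValueError("unexpected field size")
    if not (0 <= num_mics <= 15 and 0 <= num_channels <= 15 and 0 <= algos <= 15):
        raise ValueError("side info value out of range")
    # Assumed layout: byte 0 = mic count (high nibble) and channel count (low nibble);
    # byte 1 = speech-enhancement flag (bit 7) and algorithm flags (bits 0-3).
    side = struct.pack("BB", (num_mics << 4) | num_channels, (int(se_done) << 7) | algos)
    return h2 + mic1 + mic2 + side


def parse_packet(packet):
    """Split a packet back into its header, two encoded signals, and side info."""
    if len(packet) != PACKET_SIZE:
        raise ValueError("unexpected packet size")
    h2 = packet[:H2_SIZE]
    mic1 = packet[H2_SIZE:H2_SIZE + CHANNEL_SIZE]
    mic2 = packet[H2_SIZE + CHANNEL_SIZE:H2_SIZE + 2 * CHANNEL_SIZE]
    counts, flags = struct.unpack("BB", packet[-SIDE_INFO_SIZE:])
    info = {
        "num_mics": counts >> 4,
        "num_channels": counts & 0x0F,
        "se_done": bool(flags & 0x80),
        "algos": flags & 0x0F,
    }
    return h2, mic1, mic2, info


# Placeholder header bytes and dummy encoded data, for illustration only.
packet = build_packet(b"\x00\x00", bytes(CHANNEL_SIZE), bytes([1]) * CHANNEL_SIZE,
                      num_mics=3, num_channels=2, se_done=True, algos=ALGO_EC | ALGO_BF)
header, data1, data2, side_info = parse_packet(packet)
```

Because the side info travels inside the payload, a receiver in this sketch can decide how many signals to decode and which processing steps remain to be performed without any change to the header.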

A series of operations described below will be described on the assumption that the electronic device 100 transmits one packet including a plurality of voice signals to the mobile terminal 120 connected through short-range wireless communication (eg, Bluetooth). In addition, a series of operations described below may be simultaneously performed by the processor 210 or performed out of order, and some operations may be omitted or added.

In operation 501 according to an embodiment, the electronic device 100 may acquire a plurality of voice signals through the plurality of microphones 110 . For example, the electronic device 100 may acquire two or three voice signals through the plurality of microphones 110 .

In operation 503 according to an embodiment, the electronic device 100 may determine whether the number of the plurality of voice signals acquired through the plurality of microphones is three or more. If the number of the plurality of voice signals is three or more, the electronic device 100 may perform operation 505 , and if the number of the plurality of voice signals is less than or equal to two, the electronic device 100 may perform operation 509 .

In operation 505 according to an embodiment, when the number of the plurality of acquired voice signals is three or more, the electronic device 100 may improve the plurality of voice signals through the speech enhancement 240 . The improvement may mean voice processing for improving the sound quality of a voice signal. The voice processing performed by the electronic device 100 may be understood as first voice processing.

According to an embodiment, the electronic device 100 may reduce the number of the plurality of voice signals to one or two through the speech enhancement 240. For example, when the electronic device 100 acquires three voice signals through the plurality of microphones, the electronic device 100 may improve the three voice signals through the speech enhancement 240 and reduce them to one or two voice signals.

In operation 507 according to an embodiment, the electronic device 100 may extract additional information (side info) on the plurality of improved voice signals, and may encode the plurality of improved voice signals through a voice encoder (e.g., the voice encoder 113 of FIG. 1).

In operation 509 according to an embodiment, if the number of the plurality of acquired voice signals is not three or more (e.g., two or less), the electronic device 100 may omit the operation of improving the plurality of voice signals through the speech enhancement 240. The electronic device 100 may also extract side info on the plurality of voice signals and encode the plurality of voice signals through the voice encoder 113.
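The branch across operations 503 to 509 can be sketched as follows. The function names `prepare_signals` and `average_downmix` are hypothetical, and plain channel averaging stands in for the speech enhancement 240, whose actual algorithm is not specified here.

```python
from typing import Callable, List, Sequence, Tuple

Signal = List[float]


def average_downmix(signals: Sequence[Signal]) -> List[Signal]:
    """Placeholder enhancement: average all channels into one signal."""
    n = len(signals)
    return [[sum(frame) / n for frame in zip(*signals)]]


def prepare_signals(
    signals: Sequence[Signal],
    enhance: Callable[[Sequence[Signal]], List[Signal]],
) -> Tuple[List[Signal], bool]:
    """Return (signals to encode, whether first voice processing ran)."""
    if len(signals) >= 3:
        # Operation 505: enhance and reduce to one or two signals.
        reduced = enhance(signals)
        if not 1 <= len(reduced) <= 2:
            raise ValueError("enhancement must yield one or two signals")
        return reduced, True
    # Operation 509: two or fewer signals skip on-device enhancement.
    return [list(s) for s in signals], False
```

The returned flag corresponds to the "whether voice processing is performed" field that the side info may carry, so the receiving side can tell which branch was taken.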

According to an embodiment, in operation 511 , the electronic device 100 may generate a packet based on the encoded voice signals. The electronic device 100 may generate one packet further including the H2 header (eg, the header 400 of FIG. 4 ) and the additional information block 403 in the data of the plurality of encoded voice signals.
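As a rough illustration of operation 511, the sketch below concatenates a header, two encoded channels, and a side-info block into one fixed-size packet. The packet size of 60 bytes, the two-byte placeholder header, the two-byte side-info block, and the name `build_packet` are all assumptions; the actual sizes are those defined by the wireless communication protocol in use.

```python
PACKET_SIZE = 60       # assumed fixed packet size in bytes (illustrative)
HEADER = b"\xaa\x55"   # placeholder header bytes, not an actual H2 header value
SIDE_INFO_SIZE = 2     # assumed size of the additional information block


def build_packet(ch1: bytes, ch2: bytes, side_info: bytes) -> bytes:
    """Lay out header | channel 1 | channel 2 | side info in one packet."""
    if len(side_info) != SIDE_INFO_SIZE:
        raise ValueError("unexpected side-info size")
    budget = PACKET_SIZE - len(HEADER) - SIDE_INFO_SIZE
    if len(ch1) != len(ch2) or len(ch1) + len(ch2) != budget:
        raise ValueError(f"each channel must be exactly {budget // 2} bytes")
    return HEADER + ch1 + ch2 + side_info
```

Because the total size is fixed, adding a second channel and the side-info block reduces the bytes available to each encoded channel, which is why the sketch checks the payload budget before assembling the packet.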

In operation 513 according to an embodiment, the electronic device 100 may transmit one generated packet to the portable terminal 120 through the communication circuit 220 . For example, the electronic device 100 may transmit the one generated packet to the mobile terminal 120 connected through short-range wireless communication (eg, Bluetooth).

According to an embodiment, the electronic device 100 may include a communication circuit 220, a plurality of microphones 110, a memory 250 in which instructions are stored, and a processor 210 operatively coupled to the communication circuit 220, the plurality of microphones 110, and the memory 250. The processor 210 of the electronic device 100 may perform a short-range communication connection supporting the first wireless communication protocol with the portable terminal 120 through the communication circuit 220, acquire a plurality of voice signals through the plurality of microphones 110, generate a first packet having a data size defined in the first wireless communication protocol based on the acquired plurality of voice signals, and transmit the first packet to the portable terminal 120 through the communication circuit 220. The first packet may include a header, a first voice signal, a second voice signal, and an additional information block (side info) indicating that the first packet includes the first voice signal and the second voice signal.

According to an embodiment, a short-range communication connection supporting the first wireless communication protocol may be performed through Bluetooth (BT).

According to an embodiment, the electronic device 100 may include a speech enhancement circuit for enhancing the plurality of obtained voice signals.

According to an embodiment, the processor 210 may be configured to improve the obtained plurality of voice signals through the speech enhancement based on at least one method of echo cancellation, noise reduction, beamforming, or bandwidth extension.

According to an embodiment, the electronic device 100 may include a voice encoder for encoding the plurality of acquired voice signals.

According to an embodiment, when the number of the plurality of microphones is three or more, the processor 210 may acquire three or more voice signals through the plurality of microphones, improve the three or more voice signals through speech enhancement, generate a second packet having a data size defined in the first wireless communication protocol based on the improved voice signals, and transmit the second packet to the portable terminal through the communication circuit.

According to an embodiment, the processor 210 may process the three or more voice signals into two or less voice signals through the speech enhancement.

According to an embodiment, the processor 210 may encode the improved voice signals through a voice encoder, generate the second packet having the data size defined in the first wireless communication protocol based on the encoded voice signals, and transmit the second packet to the portable terminal through the communication circuit.

According to an embodiment, when the number of the plurality of microphones is two or less, the processor 210 may acquire two or less voice signals through the plurality of microphones, generate a third packet having a data size defined in the first wireless communication protocol based on the two or less voice signals, and transmit the third packet to the portable terminal through the communication circuit.

A series of operations described below will be described on the assumption that the mobile terminal 120 receives one packet including a plurality of voice signals from the electronic device 100 connected through short-range wireless communication (eg, Bluetooth). In addition, a series of operations described below may be simultaneously performed by the mobile terminal 120 or performed in a different order, and some operations may be omitted or added.

In operation 601 according to an embodiment, the mobile terminal 120 may receive a packet based on the plurality of voice signals acquired through a plurality of microphones (e.g., the plurality of microphones 110) of the electronic device 100.

According to an embodiment, the mobile terminal 120 may receive one packet including encoded data of a plurality of voice signals acquired through a plurality of microphones included in the electronic device 100 through a communication circuit.

In operation 603 according to an embodiment, the mobile terminal 120 may obtain a first voice signal and a second voice signal from a packet including encoded data of a plurality of voice signals.
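On the receiving side, operation 603 amounts to splitting the packet back into its parts. The sketch below assumes a hypothetical layout of a 2-byte header, two equally sized encoded channels, and a 2-byte side-info block in a 60-byte packet; none of these sizes, nor the name `split_packet`, come from the description.

```python
from typing import Dict

PACKET_SIZE = 60     # assumed fixed packet size in bytes (illustrative)
HEADER_SIZE = 2      # assumed header size
SIDE_INFO_SIZE = 2   # assumed size of the additional information block


def split_packet(packet: bytes) -> Dict[str, bytes]:
    """Split one received packet into header, two voice signals, and side info."""
    if len(packet) != PACKET_SIZE:
        raise ValueError("packet does not match the assumed fixed size")
    payload = packet[HEADER_SIZE:PACKET_SIZE - SIDE_INFO_SIZE]
    half = len(payload) // 2
    return {
        "header": packet[:HEADER_SIZE],
        "first_voice": payload[:half],    # still encoded at this point
        "second_voice": payload[half:],   # still encoded at this point
        "side_info": packet[PACKET_SIZE - SIDE_INFO_SIZE:],
    }
```

The two voice fields are still encoded here; decoding through the voice decoder 123 is a separate step.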

In operation 605 according to an embodiment, the mobile terminal 120 may analyze additional information (side info) on the acquired first voice signal and the acquired second voice signal, and may decode the first voice signal and the second voice signal through the voice decoder 123. The additional information may include information related to data of the plurality of acquired voice signals. For example, the additional information stored in the additional information block 403 may include at least one of: the number of microphones included in the electronic device 100; the number of voice signals (or voice channels) transmitted from the electronic device 100 to the portable terminal 120 over the short-range communication connection (e.g., Bluetooth); whether voice processing is performed through the speech enhancement 240 in the electronic device 100 (e.g., a wireless headset); or algorithm information used by the first voice processing module 112.

In operation 607 according to an embodiment, the mobile terminal 120 may improve the decoded first voice signal and the decoded second voice signal through the speech enhancement 240 . The improvement may mean voice processing for improving the sound quality of the decoded first voice signal and the decoded second voice signal.

According to an embodiment, the portable terminal 120 may improve the voice signals included in a packet acquired from the electronic device 100, based on the number of the plurality of voice signals acquired through the plurality of microphones included in the electronic device 100. For example, when the number of the plurality of voice signals acquired by the electronic device 100 is three or more, the mobile terminal 120 may receive a plurality of voice signals on which first voice processing (or improvement) has been performed by the first voice processing module 112 (e.g., speech enhancement) of the electronic device 100. In addition, the mobile terminal 120 may perform second voice processing on the plurality of first voice-processed voice signals through the second voice processing module 124 of the mobile terminal 120. As another example, when the number of the plurality of voice signals acquired by the electronic device 100 is two or less, the portable terminal 120 may receive a plurality of voice signals for which the first voice processing through the first voice processing module 112 of the electronic device 100 has been omitted. The mobile terminal 120 may then perform second voice processing, through the second voice processing module 124 of the mobile terminal 120, on the plurality of voice signals for which the first voice processing was omitted. The voice processing performed in the portable terminal 120 may be understood as second voice processing.

In operation 609 according to an embodiment, the mobile terminal 120 may transmit the improved first voice signal and the improved second voice signal to an external device (e.g., a mobile terminal of a call counterpart) through a communication circuit. For example, the external device may be understood as an electronic device capable of transmitting and receiving voice signal data through a long-range or short-range wireless communication channel.

According to an embodiment, the mobile terminal 120 may transmit, to an external device through a communication circuit, a plurality of voice signals on which both the first voice processing and the second voice processing have been performed, or a plurality of voice signals on which only the second voice processing has been performed.

According to an embodiment, the mobile terminal 120 may include a speech enhancement circuit, a communication circuit, a memory in which instructions are stored, and a processor operatively connected to the speech enhancement circuit, the communication circuit, and the memory. The processor of the mobile terminal 120 may establish a short-range communication connection supporting the first wireless communication protocol with the headset device through the communication circuit, receive from the headset device, through the communication circuit, a first packet based on a plurality of voice signals acquired through the plurality of microphones of the headset device, and obtain the first voice signal and the second voice signal from the first packet. The first packet may include a header, a first voice signal, a second voice signal, and an additional information block (side info) indicating that the first packet includes the first voice signal and the second voice signal.

According to an embodiment, the portable device 120 may include a voice decoder for decoding the first voice signal and the second voice signal.

According to an embodiment, the portable device 120 may enhance the decoded first voice signal and the decoded second voice signal through the speech enhancement.

According to an embodiment, the processor of the portable device 120 may be configured to improve the plurality of acquired voice signals through the speech enhancement based on at least one method of echo cancellation, noise reduction, beamforming, and bandwidth extension.

According to an embodiment, the portable device 120 may transmit the improved first voice signal and the improved second voice signal to an external device through the communication circuit.

According to an embodiment, the portable device 120 may perform a communication connection supporting a second wireless communication protocol with the external device through the communication circuit based on Bluetooth (BT) or cellular communication.

In operation 701 according to an embodiment, the first electronic device 700a may perform a short-range communication connection supporting the first wireless communication protocol with the second electronic device 700b through a communication circuit.

According to an embodiment, the first electronic device 700a may correspond to the electronic device 100 of FIG. 1 . For example, the first electronic device 700a may be referred to as a wireless headset that transmits one packet including a plurality of voice signals to the portable terminal 120 connected to the first electronic device 700a through Bluetooth.

According to an embodiment, the second electronic device 700b may correspond to the portable terminal 120 of FIG. 1. For example, the second electronic device 700b connected to the first electronic device 700a through Bluetooth may be referred to as a mobile terminal that receives one packet including a plurality of voice signals from the first electronic device 700a.

In operation 703 according to an embodiment, the first electronic device 700a may acquire a plurality of voice signals through a plurality of microphones included in the first electronic device 700a. For example, the first electronic device 700a may acquire a first voice signal and a second voice signal through two microphones included in the first electronic device 700a.

In operation 705 according to an embodiment, the first electronic device 700a may improve the plurality of acquired voice signals through the speech enhancement 240. The first electronic device 700a may determine whether to improve the acquired voice signals through the speech enhancement 240 based on the number of acquired voice signals. For example, when the number of the acquired voice signals is three or more, the first electronic device 700a may perform the first voice processing on the acquired voice signals through the speech enhancement 240 included in the first electronic device 700a. As another example, when the number of the acquired voice signals is two or less, the first electronic device 700a may omit the first voice processing that would otherwise be performed on the acquired voice signals through the speech enhancement 240 included in the first electronic device 700a.

According to an embodiment, although not shown in FIG. 7 , the first electronic device 700a may encode a plurality of voice signals on which the first voice processing is performed or omitted through the voice encoder 113 .

In operation 707 according to an embodiment, the first electronic device 700a may generate a packet based on the plurality of improved voice signals. The first electronic device 700a may generate one packet including the plurality of encoded voice signals.

In operation 709 according to an embodiment, the first electronic device 700a may transmit one packet including the plurality of encoded voice signals to the second electronic device 700b through the communication circuit 220 .

In operation 711 according to an embodiment, the second electronic device 700b may receive one packet including a plurality of voice signals from the first electronic device 700a through a communication circuit. Also, the second electronic device 700b may obtain a plurality of voice signals from the received packet. For example, the second electronic device 700b may obtain one packet including the first voice signal and the second voice signal from the first electronic device 700a through a communication circuit. The second electronic device 700b may obtain the first voice signal and the second voice signal from the packet.

In operation 713 according to an embodiment, the second electronic device 700b may improve the plurality of voice signals through speech enhancement.

According to an embodiment, the second electronic device 700b may analyze the side info included in the one packet obtained from the first electronic device 700a and, based on the analysis result, determine the voice processing method to apply through speech enhancement to the plurality of voice signals included in the acquired packet.

According to an embodiment, when the number of the plurality of voice signals acquired by the first electronic device 700a is two or less, the first electronic device 700a may omit the first voice processing using the first method for the acquired voice signals, and the second electronic device 700b may perform the second voice processing using the second method on the voice signals for which the first voice processing was omitted. For example, when the number of the plurality of voice signals is two or less, the second method may include at least one of echo cancellation, noise reduction, beamforming, and bandwidth extension.

According to an embodiment, when the number of the plurality of voice signals acquired by the first electronic device 700a is three or more, the first electronic device 700a may perform the first voice processing using the first method on the acquired voice signals, and the second electronic device 700b may perform the second voice processing using the second method on the voice signals on which the first voice processing has been performed. For example, when the number of the plurality of voice signals is three or more, the first method may include echo cancellation or noise reduction, and the second method may include a noise cancellation method.
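The two cases can be summarized in a small selection routine on the second electronic device. The function name, the string labels, and the choice to return all four methods in the two-or-less case are assumptions for illustration; the description only requires at least one of the four methods in that case.

```python
from typing import List

# Methods the description lists for the case where first processing is omitted.
FULL_METHODS = [
    "echo_cancellation",
    "noise_reduction",
    "beamforming",
    "bandwidth_extension",
]


def second_processing_methods(acquired_signal_count: int) -> List[str]:
    """Choose second-voice-processing methods from the count reported in side info."""
    if acquired_signal_count < 1:
        raise ValueError("at least one voice signal is required")
    if acquired_signal_count >= 3:
        # The first device already ran first voice processing, so the
        # second device applies noise cancellation only.
        return ["noise_cancellation"]
    # First voice processing was omitted; this sketch applies every listed method.
    return list(FULL_METHODS)
```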

According to an embodiment, the second electronic device 700b may identify, from the additional information included in the one packet acquired from the first electronic device 700a, that the number of voice signals acquired by the first electronic device 700a is three. Based on the identification result, the second electronic device 700b may perform second voice processing using a noise removal method through speech enhancement.

According to an embodiment, the second electronic device 700b may identify, from the additional information included in the one packet acquired from the first electronic device 700a, that the number of voice signals acquired by the first electronic device 700a is two. Based on the identification result, the second electronic device 700b may perform second voice processing using at least one of echo cancellation, noise reduction, beamforming, and bandwidth extension through speech enhancement.

In operation 715 according to an embodiment, the second electronic device 700b may transmit the plurality of improved voice signals to the external device 700c through a communication circuit.

In operation 717 according to an embodiment, the external device 700c may acquire the plurality of improved voice signals from the second electronic device 700b and output the acquired plurality of improved voice signals through an output device (e.g., a speaker).

According to an embodiment, the system may include a first electronic device 700a and a second electronic device 700b connected to the first electronic device 700a through a short-range communication connection based on a first wireless communication protocol. The first electronic device 700a may acquire a plurality of voice signals, generate a first packet having a data size defined in the first wireless communication protocol based on the plurality of voice signals, and transmit the first packet to the second electronic device 700b through a first communication circuit. The first packet may include a header, a first voice signal, a second voice signal, and an additional information block (side info) indicating that the first packet includes the first voice signal and the second voice signal. The second electronic device 700b may receive, from the first electronic device 700a through a second communication circuit, the first packet based on the plurality of voice signals acquired through a plurality of microphones of the first electronic device 700a, obtain the first voice signal and the second voice signal from the first packet, and transmit the first voice signal and the second voice signal to an external device through the second communication circuit.

According to an embodiment, when the number of the plurality of microphones included in the first electronic device 700a is three or more, the first electronic device 700a may acquire three or more voice signals through the plurality of microphones, improve the three or more voice signals through speech enhancement, generate a second packet having a data size defined in the first wireless communication protocol based on the improved voice signals, and transmit the second packet to the second electronic device 700b through the communication circuit.

According to an embodiment, when the number of the plurality of microphones included in the first electronic device 700a is two or less, the first electronic device 700a may acquire two or less voice signals through the plurality of microphones, generate a third packet having a data size defined in the first wireless communication protocol based on the obtained two or less voice signals, and transmit the third packet to the second electronic device 700b through the communication circuit.

According to an embodiment, the second electronic device 700b may decode the first voice signal and the second voice signal through a voice decoder.

According to an embodiment, the second electronic device 700b may improve the decoded first voice signal and the decoded second voice signal through speech enhancement, and may transmit the improved first voice signal and the improved second voice signal to an external device through the second communication circuit.

FIG. 8A illustrates a case in which voice processing is performed only in the portable terminal 120 according to an embodiment.

Hereinafter, an operation will be described in which the electronic device 100, which includes two or less microphones and acquires two or less voice signals through them, transmits one packet including the two or less voice signals to the mobile terminal 120.

According to an embodiment, the electronic device 100 includes two microphones 800a and may acquire two voice signals through the two microphones 800a.

According to an embodiment, based on the number of microphones included in the electronic device 100 and the number of voice signals acquired through the microphones, the electronic device 100 may determine whether to perform the first voice processing through the first voice processing module 112. For example, as shown in FIG. 8A, based on the two voice signals acquired through the two microphones 800a included in the electronic device 100, the electronic device 100 may omit the first voice processing through the first voice processing module 112.

According to an embodiment, the electronic device 100 may encode a plurality of voice signals omitting the first voice processing through the voice encoder 801a (eg, the voice encoder 113 of FIG. 1 ). The electronic device 100 may generate one packet including the plurality of encoded voice signals.

According to an embodiment, the electronic device 100 may transmit one packet including the plurality of encoded voice signals to the mobile terminal 120 connected to the electronic device 100 through short-range wireless communication (e.g., Bluetooth).

According to an embodiment, the mobile terminal 120 may receive one packet including the plurality of encoded voice signals from the electronic device 100, and may obtain the plurality of voice signals included in the received packet.

According to an embodiment, the mobile terminal 120 may decode the plurality of voice signals through the voice decoder 802a (eg, the voice decoder 123 of FIG. 1 ).

According to an embodiment, the mobile terminal 120 may improve the plurality of decoded voice signals through the second voice processing module 803a (eg, the second voice processing module 124 of FIG. 1 ). Although not shown, the mobile terminal 120 may transmit the plurality of improved voice signals to an external device through a communication circuit.

Hereinafter, an operation will be described in which the electronic device 100, which includes three or more microphones and acquires three or more voice signals through them, transmits one packet including three or more voice signals to the portable terminal 120.

According to an embodiment, the electronic device 100 includes three or more microphones 800b and may acquire three or more voice signals through the three or more microphones 800b.

According to an embodiment, based on the number of microphones included in the electronic device 100 and the number of voice signals acquired through the microphones, the electronic device 100 may determine whether to perform voice processing through the first voice processing module 801b (e.g., the first voice processing module 112 of FIG. 1). For example, as shown in FIG. 8B, based on the three or more voice signals acquired through the three or more microphones 800b, the electronic device 100 may perform the first voice processing through the first voice processing module 801b. The electronic device 100 may reduce the number of the three or more voice signals to two or less through the first voice processing. The first voice processing module 801b may provide the voice signals on which the first voice processing has been performed to the voice encoder 802b.

According to an embodiment, the voice encoder 802b may receive voice signals on which the first voice processing has been performed from the first voice processing module 801b. The electronic device 100 may encode the plurality of improved voice signals through the voice encoder 802b (eg, the voice encoder 113 of FIG. 1 ).

According to an embodiment, the electronic device 100 may generate one packet including the plurality of encoded voice signals.

According to an embodiment, the electronic device 100 may transmit the single packet including the plurality of encoded voice signals to the mobile terminal 120 connected to the electronic device 100 through short-range wireless communication (e.g., Bluetooth).

According to an embodiment, the mobile terminal 120 may receive the single packet including the plurality of encoded voice signals from the electronic device 100 through a communication circuit (e.g., the packet reception module 122 of FIG. 1).

According to an embodiment, the mobile terminal 120 may acquire the plurality of voice signals based on the one received packet.

According to an embodiment, the mobile terminal 120 may decode the acquired plurality of voice signals through the voice decoder 803b (eg, the voice decoder 123 of FIG. 1 ).

According to an embodiment, the mobile terminal 120 may perform second voice processing on the plurality of decoded voice signals through the second voice processing module 804b (e.g., the second voice processing module 124). The first voice processing may mean voice processing performed by the electronic device 100, and the second voice processing may mean voice processing performed by the portable terminal 120. The content of the first voice processing and the second voice processing may differ according to a setting value of each device. For example, the electronic device 100 may perform the first voice processing on the plurality of voice signals using echo cancellation or beamforming, and the portable terminal 120 may perform the second voice processing on the plurality of voice signals using noise reduction.

According to an embodiment, the mobile terminal 120 may transmit a plurality of voice signals on which the first voice processing and the second voice processing have been performed to an external device through a communication circuit.

The series of operations described in FIG. 8C is similar to the series of operations described in FIG. 8B, but FIG. 8C describes operations of acquiring a plurality of voice signals through a sub-microphone (sub mic) 800c and a main microphone (main mic) 810c.

According to an embodiment, the electronic device 100 may include a sub-microphone 800c and a main microphone 810c. The sub-microphone 800c may include three or more microphones, and may be understood as a microphone for acquiring noise around the electronic device 100 . The main microphone 810c may include one microphone, and may be understood as a microphone for acquiring the user's voice of the electronic device 100 .

According to an embodiment, the electronic device 100 may perform the first voice processing, through the first voice processing module 801c (e.g., the first voice processing module 112 of FIG. 1), on the plurality of voice signals (sub-microphone signals) acquired through the sub-microphone 800c. The electronic device 100 may convert the plurality of voice signals into one voice signal through the first voice processing. The electronic device 100 may encode the one voice signal on which the first voice processing has been performed through the voice encoder 802c (e.g., the voice encoder 113 of FIG. 1).

According to an embodiment, the electronic device 100 may omit the first voice processing through the first voice processing module 801c for the voice signal (the main microphone signal) acquired through the main microphone 810c, and may encode the main microphone signal through the voice encoder 802c.

According to an embodiment, the electronic device 100 may transmit, to the portable terminal 120 through the communication circuit, one packet including the sub-microphone signal on which the first voice processing and encoding have been performed and the main microphone signal that has been encoded with the first voice processing omitted.
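The sender-side handling in FIG. 8C can be sketched as below. Averaging the sub-microphone channels is only a stand-in for the first voice processing, and the function name and dictionary keys are hypothetical; the sketch is meant to show that the sub-microphone signals are merged into one channel while the main microphone signal bypasses first voice processing.

```python
from typing import Dict, List, Sequence


def prepare_fig_8c_channels(
    main_mic: Sequence[float],
    sub_mics: Sequence[Sequence[float]],
) -> List[Dict]:
    """Return the two channels that would go on to the voice encoder."""
    if not sub_mics:
        raise ValueError("at least one sub-microphone signal is required")
    # Placeholder for first voice processing: merge sub-mic channels into one.
    sub_mix = [sum(frame) / len(sub_mics) for frame in zip(*sub_mics)]
    return [
        # The main microphone signal skips first voice processing.
        {"source": "main", "first_processing": False, "samples": list(main_mic)},
        {"source": "sub", "first_processing": True, "samples": sub_mix},
    ]
```

Both resulting channels would then be encoded and placed in one packet, with the per-channel processing state available to the portable terminal 120 through the side info.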

According to an embodiment, the mobile terminal 120 may acquire a plurality of voice signals through one packet acquired from the electronic device 100 . For example, the mobile terminal 120 may obtain an encoded main microphone signal and an encoded sub microphone signal through one packet obtained from the electronic device 100 .

According to an embodiment, the mobile terminal 120 may decode the plurality of acquired voice signals through the voice decoder 803c (eg, the voice decoder 123 of FIG. 1 ).

According to an embodiment, the portable terminal 120 may perform second voice processing, through the second voice processing module 804c (e.g., the second voice processing module 124 of FIG. 1), on the sub-microphone signal obtained from the electronic device 100.

According to an embodiment, the mobile terminal 120 may obtain, from the electronic device 100, a main microphone signal for which the first voice processing is omitted, and may perform second voice processing on the main microphone signal through the second voice processing module 804c.

According to an embodiment, the mobile terminal 120 may transmit the sub-microphone signal on which the second voice processing is performed and the main microphone signal on which the second voice processing is performed to an external device through a communication circuit. The operation of transmitting the sub-microphone signal and the main microphone signal to an external device may include an operation of transmitting a packet including the sub-microphone signal and the main microphone signal to the external device.

According to an embodiment, the electronic device 100 may include a plurality of wearable devices 900 including a plurality of microphones as well as a headset device. For example, the plurality of wearable devices may include smart glasses 901 , a wireless earphone 902 , or a smart watch 903 .

According to an embodiment, each of the plurality of devices included in the wearable devices 900 may include at least one microphone. Based on the state of the voice signals acquired through the plurality of microphones included in the plurality of devices, the wearable devices 900 may determine which of the wearable devices 900 is suitable for performing the first voice processing through the first voice processing modules 901a, 902a, and 903a and the second voice processing through the second voice processing module 906. The state of the voice signals may mean the ratio of the acquired voice signal to the noise signal. For example, when the ratio of the noise signal included in the voice signal acquired through the smart watch 903 is the smallest among the plurality of devices, the wearable devices 900 may omit the first voice processing and determine the smart watch 903 as the device suitable for the second voice processing.

According to an embodiment, the wearable devices 900 may acquire a plurality of voice signals through a plurality of microphones and may compare and analyze characteristics of the acquired voice signals. The characteristics of the voice signals may refer to the proportion of noise included in each voice signal. Based on the comparison and analysis, the wearable devices 900 may determine a main microphone and a sub-microphone among the plurality of microphones included in the wearable devices 900. For example, the wearable devices 900 may compare the signal-to-noise ratio (SNR) of the voice signals and, among the plurality of microphones, determine a microphone acquiring a voice signal having a high SNR as the main microphone and a microphone acquiring a voice signal having a low SNR as a sub-microphone for noise cancellation. The SNR expresses the ratio of signal to noise: the larger the SNR, the smaller the proportion of noise, and the smaller the SNR, the larger the proportion of noise. Its unit may be decibels (dB). For example, when the SNR of the voice signal obtained through the smart watch 903 is high, the wearable devices 900 may determine the microphone included in the smart watch 903 as the main microphone. As another example, when the SNR of the voice signal obtained through the wireless earphone 902 is low, the wearable devices 900 may determine the microphone included in the wireless earphone 902 as a sub-microphone.
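
The SNR comparison described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the device labels, and the assumption that signal power and noise power have already been estimated separately are hypothetical and are not taken from this document.

```python
import math
from typing import Dict, Tuple


def snr_db(signal_power: float, noise_power: float) -> float:
    """Signal-to-noise ratio in decibels, computed from two power estimates."""
    return 10.0 * math.log10(signal_power / noise_power)


def pick_main_and_sub(snr_by_mic: Dict[str, float]) -> Tuple[str, str]:
    """Return (main, sub): the highest-SNR mic as main, the lowest-SNR mic as sub."""
    if len(snr_by_mic) < 2:
        raise ValueError("need at least two microphones to pick a main and a sub")
    main = max(snr_by_mic, key=snr_by_mic.get)
    sub = min(snr_by_mic, key=snr_by_mic.get)
    return main, sub


# Hypothetical per-device SNR values in dB, keyed by illustrative labels.
example = {
    "smart_glasses_901": 11.0,
    "wireless_earphone_902": 6.0,
    "smart_watch_903": 18.0,
}
main_mic, sub_mic = pick_main_and_sub(example)
```

With these example values the smart watch microphone is chosen as the main microphone and the wireless earphone microphone as the sub-microphone, matching the two examples given in the paragraph above.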

According to an embodiment, a device including a main microphone among a plurality of microphones included in the wearable devices 900 may encode a voice signal acquired through the main microphone through a voice encoder. The device including the main microphone may generate a packet including the encoded voice signal and transmit the packet to the portable terminal 120 through a communication circuit.

According to an embodiment, a device including a sub-microphone among the plurality of microphones included in the wearable devices 900 may perform the first voice processing, through a first voice processing module, on a voice signal acquired through the sub-microphone. For example, when the device including the sub-microphone is the wireless earphone 902, the first voice processing may be performed on the voice signal acquired through the sub-microphone through the first voice processing module 902a included in the wireless earphone 902.

According to an embodiment, the device including the sub-microphone may encode the voice signal on which the first voice processing has been performed through a voice encoder. The device including the sub-microphone may generate a packet including the encoded voice signal and transmit the packet to the portable terminal 120 through a communication circuit.

According to an embodiment, the portable terminal 120 may receive, from the device including the main microphone and the device including the sub-microphone respectively, a packet including a voice signal for which the first voice processing was omitted and a packet including a voice signal on which the first voice processing was performed, and may obtain the voice signals included in the packets.

According to an embodiment, the mobile terminal 120 may decode the voice signals included in the packets through the voice decoder 904 .

According to an embodiment, the portable terminal 120 may perform a second voice processing on the decoded voice signals through the second voice processing module 906 .

According to an embodiment, the portable terminal 120 may transmit a voice signal on which the second voice processing has been performed to the external device 700 through a communication circuit.

FIG. 10 is a block diagram of an electronic device 1001 in a network environment 1000 according to an embodiment.

Referring to FIG. 10, in the network environment 1000, the electronic device 1001 may communicate with the electronic device 1002 through a first network 1098 (e.g., a short-range wireless communication network), or may communicate with the electronic device 1004 or the server 1008 through a second network 1099 (e.g., a long-distance wireless communication network). According to an embodiment, the electronic device 1001 may communicate with the electronic device 1004 through the server 1008. According to an embodiment, the electronic device 1001 may include a processor 1020, a memory 1030, an input module 1050, a sound output module 1055, a display module 1060, an audio module 1070, a sensor module 1076, an interface 1077, a connection terminal 1078, a haptic module 1079, a camera module 1080, a power management module 1088, a battery 1089, a communication module 1090, a subscriber identification module 1096, or an antenna module 1097. In some embodiments, at least one of these components (e.g., the connection terminal 1078) may be omitted, or one or more other components may be added to the electronic device 1001. In some embodiments, some of these components (e.g., the sensor module 1076, the camera module 1080, or the antenna module 1097) may be integrated into one component (e.g., the display module 1060).

The processor 1020 may, for example, execute software (e.g., a program 1040) to control at least one other component (e.g., a hardware or software component) of the electronic device 1001 connected to the processor 1020, and may perform various data processing or operations. According to one embodiment, as at least part of the data processing or computation, the processor 1020 may store commands or data received from other components (e.g., the sensor module 1076 or the communication module 1090) in the volatile memory 1032, process the commands or data stored in the volatile memory 1032, and store the result data in the non-volatile memory 1034. According to an embodiment, the processor 1020 may include a main processor 1021 (e.g., a central processing unit or an application processor) or an auxiliary processor 1023 (e.g., a graphics processing unit, a neural processing unit (NPU), an image signal processor, a sensor hub processor, or a communication processor). For example, when the electronic device 1001 includes the main processor 1021 and the auxiliary processor 1023, the auxiliary processor 1023 may use less power than the main processor 1021 or may be set to be specialized for a specified function. The auxiliary processor 1023 may be implemented separately from or as part of the main processor 1021.

The auxiliary processor 1023 may, for example, control at least some of the functions or states related to at least one of the components of the electronic device 1001 (e.g., the display module 1060, the sensor module 1076, or the communication module 1090), on behalf of the main processor 1021 while the main processor 1021 is in an inactive (e.g., sleep) state, or together with the main processor 1021 while the main processor 1021 is in an active (e.g., executing an application) state. According to one embodiment, the auxiliary processor 1023 (e.g., an image signal processor or a communication processor) may be implemented as part of another functionally related component (e.g., the camera module 1080 or the communication module 1090). According to an embodiment, the auxiliary processor 1023 (e.g., a neural processing unit) may include a hardware structure specialized for processing an artificial intelligence model. Artificial intelligence models may be created through machine learning. Such learning may be performed, for example, in the electronic device 1001 itself on which the artificial intelligence model is run, or may be performed through a separate server (e.g., the server 1008). The learning algorithm may include, for example, supervised learning, unsupervised learning, semi-supervised learning, or reinforcement learning, but is not limited to these examples. The artificial intelligence model may include a plurality of artificial neural network layers. The artificial neural network may be one of a deep neural network (DNN), a convolutional neural network (CNN), a recurrent neural network (RNN), a restricted Boltzmann machine (RBM), a deep belief network (DBN), a bidirectional recurrent deep neural network (BRDNN), a deep Q-network, or a combination of two or more of the above, but is not limited to these examples. The artificial intelligence model may additionally or alternatively include a software structure in addition to the hardware structure.

The memory 1030 may store various data used by at least one component of the electronic device 1001 (eg, the processor 1020 or the sensor module 1076 ). The data may include, for example, input data or output data for software (eg, the program 1040 ) and instructions related thereto. The memory 1030 may include a volatile memory 1032 or a non-volatile memory 1034 .

The program 1040 may be stored as software in the memory 1030 , and may include, for example, an operating system 1042 , middleware 1044 , or an application 1046 .

The input module 1050 may receive a command or data to be used in a component (eg, the processor 1020 ) of the electronic device 1001 from the outside (eg, a user) of the electronic device 1001 . The input module 1050 may include, for example, a microphone, a mouse, a keyboard, a key (eg, a button), or a digital pen (eg, a stylus pen).

The sound output module 1055 may output a sound signal to the outside of the electronic device 1001 . The sound output module 1055 may include, for example, a speaker or a receiver. The speaker can be used for general purposes such as multimedia playback or recording playback. The receiver can be used to receive incoming calls. According to one embodiment, the receiver may be implemented separately from or as part of the speaker.

The display module 1060 may visually provide information to the outside (eg, a user) of the electronic device 1001 . The display module 1060 may include, for example, a display, a hologram device, or a projector and a control circuit for controlling the corresponding device. According to an embodiment, the display module 1060 may include a touch sensor configured to sense a touch or a pressure sensor configured to measure the intensity of a force generated by the touch.

The audio module 1070 may convert a sound into an electric signal or, conversely, convert an electric signal into a sound. According to an embodiment, the audio module 1070 may acquire a sound through the input module 1050, or may output a sound through the sound output module 1055 or an external electronic device (e.g., the electronic device 1002, such as a speaker or headphones) directly or wirelessly connected to the electronic device 1001.

The sensor module 1076 may detect an operating state (e.g., power or temperature) of the electronic device 1001 or an external environmental state (e.g., a user state), and may generate an electrical signal or data value corresponding to the sensed state. According to one embodiment, the sensor module 1076 may include, for example, a gesture sensor, a gyro sensor, a barometric sensor, a magnetic sensor, an acceleration sensor, a grip sensor, a proximity sensor, a color sensor, an infrared (IR) sensor, a biometric sensor, a temperature sensor, a humidity sensor, or an illuminance sensor.

The interface 1077 may support one or more specified protocols that may be used to directly or wirelessly connect the electronic device 1001 with an external electronic device (eg, the electronic device 1002 ). According to an embodiment, the interface 1077 may include, for example, a high definition multimedia interface (HDMI), a universal serial bus (USB) interface, an SD card interface, or an audio interface.

The connection terminal 1078 may include a connector through which the electronic device 1001 can be physically connected to an external electronic device (eg, the electronic device 1002 ). According to an embodiment, the connection terminal 1078 may include, for example, an HDMI connector, a USB connector, an SD card connector, or an audio connector (eg, a headphone connector).

The haptic module 1079 may convert an electrical signal into a mechanical stimulus (eg, vibration or movement) or an electrical stimulus that the user can perceive through tactile or kinesthetic sense. According to an embodiment, the haptic module 1079 may include, for example, a motor, a piezoelectric element, or an electrical stimulation device.

The camera module 1080 may capture still images and moving images. According to one embodiment, the camera module 1080 may include one or more lenses, image sensors, image signal processors, or flashes.

The power management module 1088 may manage power supplied to the electronic device 1001 . According to an embodiment, the power management module 1088 may be implemented as, for example, at least a part of a power management integrated circuit (PMIC).

The battery 1089 may supply power to at least one component of the electronic device 1001 . According to one embodiment, battery 1089 may include, for example, a non-rechargeable primary cell, a rechargeable secondary cell, or a fuel cell.

The communication module 1090 may support establishment of a direct (e.g., wired) communication channel or a wireless communication channel between the electronic device 1001 and an external electronic device (e.g., the electronic device 1002, the electronic device 1004, or the server 1008), and communication through the established communication channel. The communication module 1090 may include one or more communication processors that operate independently of the processor 1020 (e.g., an application processor) and support direct (e.g., wired) communication or wireless communication. According to one embodiment, the communication module 1090 may include a wireless communication module 1092 (e.g., a cellular communication module, a short-range wireless communication module, or a global navigation satellite system (GNSS) communication module) or a wired communication module 1094 (e.g., a local area network (LAN) communication module or a power line communication module). A corresponding communication module among these communication modules may communicate with the external electronic device 1004 through a first network 1098 (e.g., a short-range communication network such as Bluetooth, wireless fidelity (WiFi) direct, or infrared data association (IrDA)) or a second network 1099 (e.g., a telecommunication network such as a legacy cellular network, a 5G network, a next-generation communication network, the Internet, or a computer network (e.g., a LAN or WAN)). These various types of communication modules may be integrated into one component (e.g., a single chip) or may be implemented as a plurality of components (e.g., multiple chips) separate from each other. The wireless communication module 1092 may identify or authenticate the electronic device 1001 within a communication network, such as the first network 1098 or the second network 1099, using subscriber information (e.g., an International Mobile Subscriber Identifier (IMSI)) stored in the subscriber identification module 1096.

The wireless communication module 1092 may support a 5G network, following a 4G network, and a next-generation communication technology, for example, a new radio (NR) access technology. The NR access technology may support high-speed transmission of high-capacity data (enhanced mobile broadband (eMBB)), minimization of terminal power and access by multiple terminals (massive machine type communications (mMTC)), or high reliability and low latency (ultra-reliable and low-latency communications (URLLC)). The wireless communication module 1092 may support a high frequency band (e.g., the mmWave band) to achieve a high data rate. The wireless communication module 1092 may support various techniques for securing performance in a high frequency band, for example, beamforming, massive multiple-input and multiple-output (MIMO), full-dimensional MIMO (FD-MIMO), an array antenna, analog beamforming, or a large-scale antenna. The wireless communication module 1092 may support various requirements specified in the electronic device 1001, an external electronic device (e.g., the electronic device 1004), or a network system (e.g., the second network 1099). According to an embodiment, the wireless communication module 1092 may support a peak data rate (e.g., 20 Gbps or more) for realizing eMBB, loss coverage (e.g., 164 dB or less) for realizing mMTC, or U-plane latency (e.g., 0.5 ms or less for each of downlink (DL) and uplink (UL), or 1 ms or less round trip) for realizing URLLC.

The antenna module 1097 may transmit a signal or power to the outside (e.g., an external electronic device) or receive a signal or power from the outside. According to an embodiment, the antenna module 1097 may include an antenna including a radiator formed of a conductor or a conductive pattern formed on a substrate (e.g., a PCB). According to an embodiment, the antenna module 1097 may include a plurality of antennas (e.g., an array antenna). In this case, at least one antenna suitable for a communication scheme used in a communication network such as the first network 1098 or the second network 1099 may be selected from the plurality of antennas by, for example, the communication module 1090. A signal or power may be transmitted or received between the communication module 1090 and an external electronic device through the selected at least one antenna. According to some embodiments, components other than the radiator (e.g., a radio frequency integrated circuit (RFIC)) may be additionally formed as a part of the antenna module 1097.

According to various embodiments, the antenna module 1097 may form a mmWave antenna module. According to one embodiment, the mmWave antenna module may include a printed circuit board; an RFIC disposed on or adjacent to a first side (e.g., the bottom side) of the printed circuit board and capable of supporting a designated high frequency band (e.g., the mmWave band); and a plurality of antennas (e.g., an array antenna) disposed on or adjacent to a second side (e.g., the top or a side surface) of the printed circuit board and capable of transmitting or receiving signals of the designated high frequency band.

At least some of the components may be connected to each other through a communication method between peripheral devices (e.g., a bus, general purpose input and output (GPIO), serial peripheral interface (SPI), or mobile industry processor interface (MIPI)) and may exchange signals (e.g., commands or data) with each other.

According to an embodiment, commands or data may be transmitted or received between the electronic device 1001 and the external electronic device 1004 through the server 1008 connected to the second network 1099. Each of the external electronic devices 1002 and 1004 may be a device of the same type as or a different type from the electronic device 1001. According to an embodiment, all or a part of the operations executed by the electronic device 1001 may be executed by one or more of the external electronic devices 1002, 1004, or 1008. For example, when the electronic device 1001 needs to perform a function or service automatically or in response to a request from a user or another device, the electronic device 1001 may, instead of or in addition to executing the function or service itself, request one or more external electronic devices to perform at least a part of the function or the service. The one or more external electronic devices that have received the request may execute at least a part of the requested function or service, or an additional function or service related to the request, and transmit a result of the execution to the electronic device 1001. The electronic device 1001 may provide the result, as it is or after additional processing, as at least a part of a response to the request. For this purpose, for example, cloud computing, distributed computing, mobile edge computing (MEC), or client-server computing technology may be used. The electronic device 1001 may provide an ultra-low latency service using, for example, distributed computing or mobile edge computing. In another embodiment, the external electronic device 1004 may include an Internet of things (IoT) device. The server 1008 may be an intelligent server using machine learning and/or neural networks. According to an embodiment, the external electronic device 1004 or the server 1008 may be included in the second network 1099. The electronic device 1001 may be applied to an intelligent service (e.g., smart home, smart city, smart car, or health care) based on 5G communication technology and IoT-related technology.

The electronic device according to various embodiments disclosed in this document may have various types of devices. The electronic device may include, for example, a portable communication device (eg, a smart phone), a computer device, a portable multimedia device, a portable medical device, a camera, a wearable device, or a home appliance device. The electronic device according to the embodiment of the present document is not limited to the above-described devices.

It should be understood that the various embodiments of this document and the terms used therein are not intended to limit the technical features described in this document to specific embodiments, and include various modifications, equivalents, or substitutions of the embodiments. In connection with the description of the drawings, like reference numerals may be used for similar or related components. The singular form of a noun corresponding to an item may include one or more of the items, unless the relevant context clearly dictates otherwise. As used herein, each of the phrases "A or B", "at least one of A and B", "at least one of A or B", "A, B or C", "at least one of A, B and C", and "at least one of A, B, or C" may include any one of the items listed together in the corresponding phrase, or all possible combinations thereof. Terms such as "first" and "second", or "1st" and "2nd", may be used simply to distinguish a component from other components in question, and do not limit the components in other aspects (e.g., importance or order). When one (e.g., first) component is referred to as being "coupled" or "connected" to another (e.g., second) component, with or without the terms "functionally" or "communicatively", it means that the one component can be connected to the other component directly (e.g., by wire), wirelessly, or through a third component.

The term "module" used in various embodiments of this document may include a unit implemented in hardware, software, or firmware, and may be used interchangeably with terms such as logic, logic block, component, or circuit. A module may be an integrally formed part, or a minimum unit of the part or a portion thereof that performs one or more functions. For example, according to an embodiment, the module may be implemented in the form of an application-specific integrated circuit (ASIC).

Various embodiments of the present document may be implemented as software (e.g., the program 1040) including one or more instructions stored in a storage medium (e.g., internal memory 1036 or external memory 1038) readable by a machine (e.g., the electronic device 1001). For example, the processor (e.g., the processor 1020) of the machine (e.g., the electronic device 1001) may call at least one of the one or more instructions stored in the storage medium and execute it. This enables the machine to be operated to perform at least one function according to the at least one instruction called. The one or more instructions may include code generated by a compiler or code executable by an interpreter. The machine-readable storage medium may be provided in the form of a non-transitory storage medium. Here, "non-transitory" only means that the storage medium is a tangible device and does not contain a signal (e.g., an electromagnetic wave); the term does not distinguish between the case where data is stored semi-permanently in the storage medium and the case where data is stored temporarily.

According to one embodiment, the method according to various embodiments disclosed in this document may be provided as part of a computer program product. Computer program products may be traded between sellers and buyers as commodities. The computer program product may be distributed in the form of a machine-readable storage medium (e.g., compact disc read only memory (CD-ROM)), or may be distributed online (e.g., downloaded or uploaded) via an application store (e.g., Play Store™) or directly between two user devices (e.g., smartphones). In the case of online distribution, at least a part of the computer program product may be temporarily stored or temporarily generated in a machine-readable storage medium such as a memory of a server of a manufacturer, a server of an application store, or a relay server.

According to various embodiments, each component (e.g., a module or a program) of the above-described components may include a single entity or a plurality of entities, and some of the plurality of entities may be separately disposed in other components. According to various embodiments, one or more components or operations among the above-described components may be omitted, or one or more other components or operations may be added. Alternatively or additionally, a plurality of components (e.g., modules or programs) may be integrated into one component. In this case, the integrated component may perform one or more functions of each of the plurality of components identically or similarly to the way they were performed by the corresponding component among the plurality of components prior to the integration. According to various embodiments, operations performed by a module, program, or other component may be executed sequentially, in parallel, repeatedly, or heuristically; one or more of the operations may be executed in a different order or omitted; or one or more other operations may be added.

Claims

An electronic device comprising:

a communication circuit;

a plurality of microphones;

a memory in which instructions are stored; and

a processor operatively coupled with the communication circuit, the plurality of microphones, and the memory,

wherein the processor is configured to:

perform a short-range communication connection supporting a first wireless communication protocol with a portable device through the communication circuit;

acquire a plurality of voice signals through the plurality of microphones;

generate, based on the acquired plurality of voice signals, a first packet having a data size defined in the first wireless communication protocol, the first packet including a header, a first voice signal, a second voice signal, and an additional information block (side info) indicating that the first packet includes the first voice signal and the second voice signal; and

transmit the first packet to the portable terminal through the communication circuit.
The electronic device of claim 1,

wherein the short-range communication connection supporting the first wireless communication protocol is performed through Bluetooth (BT).
The electronic device of claim 1,

further comprising a speech enhancement circuit for enhancing the plurality of acquired voice signals.
The electronic device of claim 3,

wherein the processor is configured to improve the plurality of voice signals through speech enhancement performed based on at least one of echo cancellation, noise reduction, beamforming, and bandwidth extension.
The electronic device of claim 1,

further comprising a voice encoder for encoding the plurality of acquired voice signals.
The electronic device of claim 1,

wherein, when the number of the plurality of microphones is three or more, the processor is configured to:

acquire three or more voice signals through the plurality of microphones;

improve the three or more voice signals through speech enhancement;

generate a second packet having a data size defined in the first wireless communication protocol based on the improved voice signals; and

transmit the second packet to the portable terminal through the communication circuit.
The electronic device of claim 6,

wherein the processor processes the three or more voice signals into two or fewer voice signals through the speech enhancement.
The electronic device of claim 6,

wherein the processor is configured to:

encode the improved voice signals through a voice encoder;

generate the second packet having a data size defined in the first wireless communication protocol based on the encoded voice signals; and

transmit the second packet to the portable terminal through the communication circuit.
The electronic device of claim 1,

wherein, when the number of the plurality of microphones is two or fewer, the processor is configured to:

acquire two or fewer voice signals through the plurality of microphones;

generate a third packet having a data size defined in the first wireless communication protocol based on the two or fewer voice signals; and

transmit the third packet to the portable terminal through the communication circuit.
An electronic device comprising:

a speech enhancement circuit;

a communication circuit;

a memory in which instructions are stored; and

a processor operatively coupled with the speech enhancement circuit, the communication circuit, and the memory,

wherein the processor is configured to:

perform a short-range communication connection supporting a first wireless communication protocol with a headset device through the communication circuit;

receive, from the headset device through the communication circuit, a first packet based on a plurality of voice signals acquired through a plurality of microphones of the headset device, the first packet including a header, a first voice signal, a second voice signal, and side info indicating that the first packet includes the first voice signal and the second voice signal; and

obtain the first voice signal and the second voice signal from the first packet.
The electronic device of claim 10,

further comprising a voice decoder for decoding the first voice signal and the second voice signal.
The electronic device of claim 11,

wherein the processor is configured to improve the decoded first voice signal and the decoded second voice signal through the speech enhancement.
The electronic device of claim 12,

wherein the processor is configured to improve the voice signals through the speech enhancement performed based on at least one of echo cancellation, noise reduction, beamforming, and bandwidth extension.
The electronic device of claim 12,

wherein the processor is configured to transmit the improved first voice signal and the improved second voice signal to an external device through the communication circuit.
The electronic device of claim 10,

wherein the processor performs a communication connection supporting a second wireless communication protocol with an external device through the communication circuit, based on Bluetooth (BT) or cellular communication.