DomenicodEsposito-9844 asked · jgolde edited

Distorted microphone audio when the loudspeaker is enabled on iOS platform

Hi,

I am maintaining a push-to-talk (PTT) VoIP app. When a PTT call is running, the app creates an audio session:

    m_AudioSession = AVAudioSession.SharedInstance();

    NSError error;
    if (!m_AudioSession.SetCategory(AVAudioSession.CategoryPlayAndRecord, AVAudioSessionCategoryOptions.DefaultToSpeaker | AVAudioSessionCategoryOptions.AllowBluetooth, out error))
    {
        IOSErrorLogger.Log(DammLoggerLevel.Error, TAG, error, "Error setting the category");
    }

    if (!m_AudioSession.SetMode(AVAudioSession.ModeVoiceChat, out error))
    {
        IOSErrorLogger.Log(DammLoggerLevel.Error, TAG, error, "Error setting the mode");
    }

    if (!m_AudioSession.OverrideOutputAudioPort(AVAudioSessionPortOverride.Speaker, out error))
    {
        IOSErrorLogger.Log(DammLoggerLevel.Error, TAG, error, "Error redirecting the audio to the loudspeaker");
    }

    if (!m_AudioSession.SetPreferredIOBufferDuration(0.06, out error)) // 60 milliseconds
    {
        IOSErrorLogger.Log(DammLoggerLevel.Error, TAG, error, "Error setting the preferred buffer duration");
    }

    if (!m_AudioSession.SetPreferredSampleRate(8000, out error)) // 8 kHz
    {
        IOSErrorLogger.Log(DammLoggerLevel.Error, TAG, error, "Error setting the preferred sample rate");
    }

    if (!m_AudioSession.SetActive(true, out error))
    {
        IOSErrorLogger.Log(DammLoggerLevel.Error, TAG, error, "Error activating the audio session");
    }
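
As an aside (not in the original post): the Preferred* calls are only hints to the system, and the hardware can grant a different sample rate and buffer size once the route changes to the built-in speaker. A minimal diagnostic sketch, reusing the post's own logger, to record what the session actually granted after activation:

    // Sketch: log the values the hardware actually granted (they may differ
    // from the preferred 8000 Hz / 60 ms once the speaker route is active).
    DammLogger.Log(DammLoggerLevel.Warn, TAG, "Granted sample rate: {0} Hz, IO buffer duration: {1} s",
        m_AudioSession.SampleRate, m_AudioSession.IOBufferDuration);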

The received audio is played using an OutputAudioQueue, and the microphone audio is captured using a Voice-Processing I/O unit, as recommended in the Apple documentation: https://developer.apple.com/documentation/avfaudio/avaudiosession/mode/1616455-voicechat.
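
(The playback side is not shown in the post; for context, a rough editorial sketch of creating and feeding an OutputAudioQueue with the same PCM format defined below, with the buffer size chosen arbitrarily here:)

    // Sketch (not the poster's code): play decoded 8 kHz PCM packets.
    OutputAudioQueue playQueue = new OutputAudioQueue(audioFormat);
    playQueue.AllocateBuffer(256 * BYTES_X_FRAME, out IntPtr pcmBuffer);
    // ... copy one packet of decoded PCM into pcmBuffer, then enqueue it:
    playQueue.EnqueueBuffer(pcmBuffer, 256 * BYTES_X_FRAME, null);
    playQueue.Start();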
The initialization code for the Voice-Processing I/O unit is:

    AudioStreamBasicDescription audioFormat = new AudioStreamBasicDescription()
    {
        SampleRate = SAMPLERATE_8000,
        Format = AudioFormatType.LinearPCM,
        FormatFlags = AudioFormatFlags.LinearPCMIsSignedInteger | AudioFormatFlags.LinearPCMIsPacked,
        FramesPerPacket = 1,
        ChannelsPerFrame = CHANNELS,
        BitsPerChannel = BITS_X_SAMPLE,
        BytesPerPacket = BYTES_X_SAMPLE,
        BytesPerFrame = BYTES_X_FRAME,
        Reserved = 0
    };
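
    // Editorial note: the constants above are not defined in the post. Values
    // consistent with 16-bit mono linear PCM at 8 kHz would be (assumption):
    //   const double SAMPLERATE_8000 = 8000;                      // Hz
    //   const int    CHANNELS        = 1;                         // mono
    //   const int    BITS_X_SAMPLE   = 16;                        // bits per sample
    //   const int    BYTES_X_SAMPLE  = BITS_X_SAMPLE / 8;         // 2 bytes
    //   const int    BYTES_X_FRAME   = BYTES_X_SAMPLE * CHANNELS; // 2 bytes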

    AudioComponent audioComp = AudioComponent.FindComponent(AudioTypeOutput.VoiceProcessingIO);
    AudioUnit.AudioUnit voiceProcessing = new AudioUnit.AudioUnit(audioComp);

    AudioUnitStatus unitStatus = AudioUnitStatus.NoError;

    unitStatus = voiceProcessing.SetEnableIO(true, AudioUnitScopeType.Input, ELEM_Mic);
    if (unitStatus != AudioUnitStatus.NoError)
    {
        DammLogger.Log(DammLoggerLevel.Warn, TAG, "Audio Unit SetEnableIO(true, AudioUnitScopeType.Input, ELEM_Mic) returned: {0}", unitStatus);
    }

    unitStatus = voiceProcessing.SetEnableIO(true, AudioUnitScopeType.Output, ELEM_Speaker);
    if (unitStatus != AudioUnitStatus.NoError)
    {
        DammLogger.Log(DammLoggerLevel.Warn, TAG, "Audio Unit SetEnableIO(true, AudioUnitScopeType.Output, ELEM_Speaker) returned: {0}", unitStatus);
    }

    unitStatus = voiceProcessing.SetFormat(audioFormat, AudioUnitScopeType.Output, ELEM_Mic);
    if (unitStatus != AudioUnitStatus.NoError)
    {
        DammLogger.Log(DammLoggerLevel.Warn, TAG, "Audio Unit SetFormat (MIC-OUTPUT) returned: {0}", unitStatus);
    }

    unitStatus = voiceProcessing.SetFormat(audioFormat, AudioUnitScopeType.Input, ELEM_Speaker);
    if (unitStatus != AudioUnitStatus.NoError)
    {
        DammLogger.Log(DammLoggerLevel.Warn, TAG, "Audio Unit SetFormat (SPEAKER-INPUT) returned: {0}", unitStatus);
    }

    unitStatus = voiceProcessing.SetRenderCallback(AudioUnit_RenderCallback, AudioUnitScopeType.Input, ELEM_Speaker);
    if (unitStatus != AudioUnitStatus.NoError)
    {
        DammLogger.Log(DammLoggerLevel.Warn, TAG, "Audio Unit SetRenderCallback returned: {0}", unitStatus);
    }

    ...

    voiceProcessing.Initialize();
    voiceProcessing.Start();
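
One way to confirm what the unit actually negotiated after initialization (an editorial sketch, reusing the post's logger and the GetAudioFormat accessor mentioned in the update below):

    // Sketch: read back the negotiated format on the microphone's output scope.
    AudioStreamBasicDescription negotiated = voiceProcessing.GetAudioFormat(AudioUnitScopeType.Output, ELEM_Mic);
    DammLogger.Log(DammLoggerLevel.Warn, TAG, "Negotiated mic format: {0} Hz, {1} ch, {2} bits",
        negotiated.SampleRate, negotiated.ChannelsPerFrame, negotiated.BitsPerChannel);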

And the RenderCallback function is:

    private AudioUnitStatus AudioUnit_RenderCallback(AudioUnitRenderActionFlags actionFlags, AudioTimeStamp timeStamp, uint busNumber, uint numberFrames, AudioBuffers data)
    {
        AudioUnit.AudioUnit voiceProcessing = m_VoiceProcessing;
        if (voiceProcessing != null)
        {
            // Pull the microphone input signal from the voice-processing unit
            var status = voiceProcessing.Render(ref actionFlags, timeStamp, ELEM_Mic, numberFrames, data);
            if (status != AudioUnitStatus.NoError)
            {
                return status;
            }

            if (data.Count > 0)
            {
                unsafe
                {
                    short* samples = (short*)data[0].Data.ToPointer();

                    for (uint idxSrcFrame = 0; idxSrcFrame < numberFrames; idxSrcFrame++)
                    {
                        // ... send the collected microphone audio (samples[idxSrcFrame])
                    }
                }
            }
        }

        return AudioUnitStatus.NoError;
    }

I am facing the problem that when the loudspeaker is enabled:

    if (!m_AudioSession.OverrideOutputAudioPort(AVAudioSessionPortOverride.Speaker, out error))
    {
        IOSErrorLogger.Log(DammLoggerLevel.Error, TAG, error, "Error redirecting the audio to the loudspeaker");
    }

the microphone audio is corrupted (sometimes the speech is impossible to understand).
If the loudspeaker is NOT enabled (AVAudioSessionPortOverride.Speaker is not set), the audio is clear.

I have already verified that NumberChannels in the AudioBuffer returned by Render is 1 (mono audio).

Any hint that helps solve the problem is very much appreciated. Thanks.

Update: The AudioUnit_RenderCallback method is called every 32 ms. When the loudspeaker is disabled, the number of frames received is 256, which is exactly what is expected (0.032 s × 8000 Hz = 256 frames). When the loudspeaker is enabled, the number of frames received is 85. In both cases GetAudioFormat returns the expected values: BitsPerChannel=16, BytesPerFrame=2, FramesPerPacket=1, ChannelsPerFrame=1, SampleRate=8000.
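
Not part of the original question, but since the callback delivers a variable number of frames when the speaker is active, a repacketizing buffer is one way to keep feeding fixed 256-frame packets downstream; this addresses packet sizing only, not the distortion itself. A minimal sketch with a hypothetical helper, assuming the samples are first copied out of the render buffer:

    // Hypothetical repacketizer: accumulate variable-size callbacks (e.g. 85
    // frames) and emit fixed 256-frame packets to the existing send path.
    // Requires: using System.Collections.Generic;
    private readonly List<short> m_Pending = new List<short>();
    private const int FRAMES_X_PACKET = 256;

    private void QueueMicSamples(short[] samples)
    {
        m_Pending.AddRange(samples);
        while (m_Pending.Count >= FRAMES_X_PACKET)
        {
            short[] packet = m_Pending.GetRange(0, FRAMES_X_PACKET).ToArray();
            m_Pending.RemoveRange(0, FRAMES_X_PACKET);
            // ... hand `packet` to the same send routine used in the callback
        }
    }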

dotnet-ios

LucasZhang-MSFT: Try adding the line sampleRate = AudioSession.CurrentHardwareSampleRate; when initializing the AudioStreamBasicDescription. By the way, what is the iOS version of the device on which you are debugging the app?
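
(Editorial sketch of that suggestion in terms of the non-deprecated API; the use of AVAudioSession.SampleRate here is an assumption, since CurrentHardwareSampleRate is deprecated:)

    // Sketch: take the hardware's actual rate instead of hard-coding 8000 Hz.
    double hwRate = AVAudioSession.SharedInstance().SampleRate; // e.g. 48000 with the speaker active

    AudioStreamBasicDescription audioFormat = new AudioStreamBasicDescription()
    {
        SampleRate = hwRate,
        Format = AudioFormatType.LinearPCM,
        FormatFlags = AudioFormatFlags.LinearPCMIsSignedInteger | AudioFormatFlags.LinearPCMIsPacked,
        FramesPerPacket = 1,
        ChannelsPerFrame = 1,
        BitsPerChannel = 16,
        BytesPerPacket = 2,
        BytesPerFrame = 2,
        Reserved = 0
    };
    // The app would then need to resample to 8 kHz before encoding, if required.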


DomenicodEsposito-9844: I tried to use AudioSession.SampleRate (CurrentHardwareSampleRate is deprecated by Apple) but this did not solve the issue. On the test device the audio session sample rate is 48000 when the loudspeaker is on, and using this value I am sometimes completely missing the callback from the AudioUnit. In any case, the Apple documentation says: “The audio unit provides format conversion between the hardware audio formats and your application audio format” (Using Specific Audio Units). The audio format is applied to the AudioUnit’s output scope of the microphone.
The iOS version I am using in the tests is 14.4.1.


LucasZhang-MSFT: The issue in this case seems to be caused by the iOS native library, so you could post the issue to https://developer.apple.com/forums/.

jgolde: To add some clarity to Lucas' reply above: audio processing happens on the device. Xamarin.iOS is only a wrapper around the iOS SDK and does not in any way affect audio quality on an iOS device. All Xamarin does is allow C# code to call into the iOS APIs; the audio processing itself happens directly in iOS code, outside the Xamarin.iOS domain. This is why we suggest you reach out to Apple for support on this.


0 Answers