Get active speakers within a call
During an active call, you may want to get a list of active speakers in order to render or display them differently. Here's how.
Prerequisites
- An Azure account with an active subscription. Create an account for free.
- A deployed Communication Services resource. Create a Communication Services resource.
- A user access token to enable the calling client. For more information, see Create and manage access tokens.
- Optional: Complete the quickstart to add voice calling to your application.
Install the SDK
Use the npm install command to install the Azure Communication Services Common and Calling SDK for JavaScript:
npm install @azure/communication-common --save
npm install @azure/communication-calling --save
Initialize required objects
A CallClient instance is required for most call operations. When you create a new CallClient instance, you can configure it with custom options like a Logger instance.
With the CallClient instance, you can create a CallAgent instance by calling the createCallAgent method. This method asynchronously returns a CallAgent instance object.
The createCallAgent method uses CommunicationTokenCredential as an argument. It accepts a user access token.
You can use the getDeviceManager method on the CallClient instance to access deviceManager.
const { CallClient } = require('@azure/communication-calling');
const { AzureCommunicationTokenCredential} = require('@azure/communication-common');
const { AzureLogger, setLogLevel } = require("@azure/logger");
// Set the logger's log level
setLogLevel('verbose');
// Redirect log output to console, file, buffer, REST API, or whatever location you want
AzureLogger.log = (...args) => {
console.log(...args); // Redirect log output to console
};
const userToken = '<USER_TOKEN>';
// Custom options, such as a logger, can optionally be passed to the constructor
const callClient = new CallClient();
const tokenCredential = new AzureCommunicationTokenCredential(userToken);
const callAgent = await callClient.createCallAgent(tokenCredential, {displayName: 'optional Azure Communication Services user name'});
const deviceManager = await callClient.getDeviceManager();
How to best manage SDK connectivity to Microsoft infrastructure
The Call Agent instance helps you manage calls (to join or start calls). To work, the Calling SDK needs to connect to Microsoft infrastructure to get notifications of incoming calls and coordinate other call details. Your Call Agent has two possible states:
Connected - A Call Agent connectionState value of Connected means the client SDK is connected and capable of receiving notifications from Microsoft infrastructure.
Disconnected - A Call Agent connectionState value of Disconnected means there's an issue that is preventing the SDK from properly connecting. Call Agent should be re-created. The disconnect reason is one of the following:
- invalidToken: If a token is expired or is invalid, the Call Agent instance disconnects with this error.
- connectionIssue: If there's an issue with the client connecting to Microsoft infrastructure, after many retries Call Agent exposes the connectionIssue error.
You can check if your local Call Agent is connected to Microsoft infrastructure by inspecting the current value of the connectionState property. During an active call you can listen to the connectionStateChanged event to determine if Call Agent changes from Connected to Disconnected state.
const connectionState = callAgentInstance.connectionState;
console.log(connectionState); // it may return either of 'Connected' | 'Disconnected'
const connectionStateCallback = (args) => {
console.log(args); // it will return an object with oldState and newState, each of having a value of either of 'Connected' | 'Disconnected'
// it will also return reason, either of 'invalidToken' | 'connectionIssue'
}
callAgentInstance.on('connectionStateChanged', connectionStateCallback);
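The states and reasons described above suggest a small decision step inside the connectionStateChanged handler. The following is a minimal sketch of that step, not part of the SDK: the type names and the recoveryActionFor helper are hypothetical, and the payload shape (oldState, newState, reason) is assumed from the comments in the preceding example.

```typescript
// Hypothetical shapes that mirror the event payload described above; they are not imported from the SDK.
type AgentConnectionState = 'Connected' | 'Disconnected';
type AgentDisconnectReason = 'invalidToken' | 'connectionIssue';

interface AgentConnectionStateChange {
    oldState: AgentConnectionState;
    newState: AgentConnectionState;
    reason?: AgentDisconnectReason;
}

type RecoveryAction = 'none' | 'refreshTokenAndRecreate' | 'recreate';

// Decide what the application could do after a connectionStateChanged event.
function recoveryActionFor(change: AgentConnectionStateChange): RecoveryAction {
    if (change.newState === 'Connected') {
        return 'none';
    }
    // A disconnected Call Agent should be re-created; an invalid token has to be replaced first.
    return change.reason === 'invalidToken' ? 'refreshTokenAndRecreate' : 'recreate';
}
```

An application could call recoveryActionFor(args) from connectionStateCallback and then fetch a new token or create a new Call Agent, depending on the result.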
Dominant speakers for a call is an extended feature of the core Call API and allows you to obtain a list of the active speakers in the call.
This is a ranked list, where the first element in the list represents the last active speaker on the call and so on.
In order to obtain the dominant speakers in a call, you first need to obtain the call dominant speakers feature API object:
const callDominantSpeakersApi = call.feature(Features.DominantSpeakers);
Then, obtain the list of the dominant speakers by calling dominantSpeakers. This has a type of DominantSpeakersInfo, which has the following members:
- speakersList contains the list of the ranked dominant speakers in the call. These are represented by their participant ID.
- timestamp is the latest update time for the dominant speakers in the call.
let dominantSpeakers: DominantSpeakersInfo = callDominantSpeakersApi.dominantSpeakers;
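Because each snapshot carries a timestamp, an application can compare it with the snapshot it last rendered before touching the UI. The sketch below assumes only the two members listed above (speakersList and timestamp); the SpeakersSnapshot interface and the isNewerSnapshot helper are hypothetical names, and whether the SDK can ever deliver an older snapshot is not stated in this article.

```typescript
// Hypothetical local shape with the two members described above; not imported from the SDK.
interface SpeakersSnapshot {
    speakersList: ReadonlyArray<unknown>;
    timestamp: Date;
}

// Returns true when `next` is strictly newer than the snapshot the UI last rendered.
function isNewerSnapshot(lastRendered: SpeakersSnapshot | undefined, next: SpeakersSnapshot): boolean {
    return lastRendered === undefined || next.timestamp.getTime() > lastRendered.timestamp.getTime();
}
```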
Also, you can subscribe to the dominantSpeakersChanged event to know when the dominant speakers list has changed:
const dominantSpeakersChangedHandler = () => {
// Get the most up to date list of dominant speakers
let dominantSpeakers = callDominantSpeakersApi.dominantSpeakers;
};
callDominantSpeakersApi.on('dominantSpeakersChanged', dominantSpeakersChangedHandler);
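The dominantSpeakersChanged handler receives no payload, so the application has to work out what changed by comparing the previous list with the current one. The following sketch shows one way to do that. It assumes each entry of speakersList has already been reduced to a string key (for example, a communicationUserId); the SpeakerListDiff type and the diffSpeakers helper are hypothetical and not part of the SDK.

```typescript
// Result of comparing two dominant speaker lists.
interface SpeakerListDiff {
    added: string[];   // keys present now but not before
    removed: string[]; // keys present before but not now
}

// Compare the previous and current lists of speaker keys.
function diffSpeakers(previous: readonly string[], current: readonly string[]): SpeakerListDiff {
    const previousSet = new Set(previous);
    const currentSet = new Set(current);
    return {
        added: current.filter(id => !previousSet.has(id)),
        removed: previous.filter(id => !currentSet.has(id)),
    };
}
```

A handler could keep the last list in a variable, call diffSpeakers on every event, and update only the tiles for the added and removed entries.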
Handle the Dominant Speaker's video streams
Your application can use the DominantSpeakers feature to render one or more of the dominant speakers' video streams, and keep updating the UI whenever the dominant speakers list changes. The following code example shows how.
// RemoteParticipant obj representation of the dominant speaker
let dominantRemoteParticipant: RemoteParticipant;
// It is recommended to use a map to keep track of a stream's associated renderer
let streamRenderersMap = new Map<RemoteVideoStream, VideoStreamRenderer>();
function getRemoteParticipantForDominantSpeaker(dominantSpeakerIdentifier) {
let dominantRemoteParticipant: RemoteParticipant;
switch(dominantSpeakerIdentifier.kind) {
case 'communicationUser': {
dominantRemoteParticipant = currentCall.remoteParticipants.find(rm => {
return (rm.identifier as CommunicationUserIdentifier).communicationUserId === dominantSpeakerIdentifier.communicationUserId
});
break;
}
case 'microsoftTeamsUser': {
dominantRemoteParticipant = currentCall.remoteParticipants.find(rm => {
return (rm.identifier as MicrosoftTeamsUserIdentifier).microsoftTeamsUserId === dominantSpeakerIdentifier.microsoftTeamsUserId
});
break;
}
case 'unknown': {
dominantRemoteParticipant = currentCall.remoteParticipants.find(rm => {
return (rm.identifier as UnknownIdentifier).id === dominantSpeakerIdentifier.id
});
break;
}
}
return dominantRemoteParticipant;
}
// Handler function for when the dominant speaker changes
const dominantSpeakersChangedHandler = async () => {
// Get the new dominant speaker's identifier
const newDominantSpeakerIdentifier = currentCall.feature(Features.DominantSpeakers).dominantSpeakers.speakersList[0];
if (newDominantSpeakerIdentifier) {
// Get the remote participant object that matches newDominantSpeakerIdentifier
const newDominantRemoteParticipant = getRemoteParticipantForDominantSpeaker(newDominantSpeakerIdentifier);
// Create the new dominant speaker's stream renderers
const streamViews = [];
for (const stream of newDominantRemoteParticipant.videoStreams) {
if (stream.isAvailable && !streamRenderersMap.get(stream)) {
const renderer = new VideoStreamRenderer(stream);
streamRenderersMap.set(stream, renderer);
const view = await renderer.createView();
streamViews.push(view);
}
}
// Remove the old dominant speaker's video streams by disposing of their associated renderers
if (dominantRemoteParticipant) {
for (const stream of dominantRemoteParticipant.videoStreams) {
const renderer = streamRenderersMap.get(stream);
if (renderer) {
streamRenderersMap.delete(stream);
renderer.dispose();
}
}
}
// Set the new dominant remote participant obj
dominantRemoteParticipant = newDominantRemoteParticipant
// Render the new dominant remote participant's streams
for (const view of streamViews) {
htmlElement.appendChild(view.target);
}
}
};
// When call is disconnected, set the dominant speaker to undefined
currentCall.on('stateChanged', () => {
if (currentCall.state === 'Disconnected') {
dominantRemoteParticipant = undefined;
}
});
const dominantSpeakerIdentifier = currentCall.feature(Features.DominantSpeakers).dominantSpeakers.speakersList[0];
dominantRemoteParticipant = getRemoteParticipantForDominantSpeaker(dominantSpeakerIdentifier);
currentCall.feature(Features.DominantSpeakers).on('dominantSpeakersChanged', dominantSpeakersChangedHandler);
const subscribeToRemoteVideoStream = async (stream: RemoteVideoStream, participant: RemoteParticipant) => {
let renderer: VideoStreamRenderer;
const displayVideo = async () => {
renderer = new VideoStreamRenderer(stream);
streamRenderersMap.set(stream, renderer);
const view = await renderer.createView();
htmlElement.appendChild(view.target);
}
stream.on('isAvailableChanged', async () => {
if (dominantRemoteParticipant !== participant) {
return;
}
renderer = streamRenderersMap.get(stream);
if (stream.isAvailable && !renderer) {
await displayVideo();
} else if (!stream.isAvailable && renderer) {
streamRenderersMap.delete(stream);
renderer.dispose();
}
});
if (dominantRemoteParticipant !== participant) {
return;
}
renderer = streamRenderersMap.get(stream);
if (stream.isAvailable && !renderer) {
await displayVideo();
}
}
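The example above repeats the same bookkeeping in several places: look up a renderer in the map, create it if missing, and delete and dispose of it later. As a sketch only, that bookkeeping could be gathered into one small helper. The RendererRegistry class and the DisposableRenderer interface below are hypothetical names, not SDK types; the only assumption about a renderer is that it has a dispose() method, as VideoStreamRenderer does in the code above.

```typescript
// Minimal contract the registry needs from a renderer (hypothetical, not an SDK type).
interface DisposableRenderer {
    dispose(): void;
}

// Tracks one renderer per stream and guarantees dispose() is called once on release.
class RendererRegistry<S, R extends DisposableRenderer> {
    private readonly renderers = new Map<S, R>();
    private readonly createRenderer: (stream: S) => R;

    constructor(createRenderer: (stream: S) => R) {
        this.createRenderer = createRenderer;
    }

    // Returns the existing renderer for a stream, or creates and stores a new one.
    acquire(stream: S): R {
        let renderer = this.renderers.get(stream);
        if (!renderer) {
            renderer = this.createRenderer(stream);
            this.renderers.set(stream, renderer);
        }
        return renderer;
    }

    // Disposes of and forgets the renderer for a stream; returns false when none was tracked.
    release(stream: S): boolean {
        const renderer = this.renderers.get(stream);
        if (!renderer) {
            return false;
        }
        this.renderers.delete(stream);
        renderer.dispose();
        return true;
    }

    // Number of renderers currently tracked.
    get size(): number {
        return this.renderers.size;
    }
}
```

With a factory such as stream => new VideoStreamRenderer(stream), acquire would replace the "get, then create and set" pattern and release would replace the "get, delete, dispose" pattern. Calling release on a stream that has no renderer is a no-op, which avoids calling dispose() on an undefined value.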
Install the SDK
Locate your project-level build.gradle file and add mavenCentral() to the list of repositories under buildscript and allprojects:
buildscript {
repositories {
...
mavenCentral()
...
}
}
allprojects {
repositories {
...
mavenCentral()
...
}
}
Then, in your module-level build.gradle file, add the following lines to the dependencies section:
dependencies {
...
implementation 'com.azure.android:azure-communication-calling:1.0.0'
...
}
Initialize the required objects
To create a CallAgent instance, you have to call the createCallAgent method on a CallClient instance. This call asynchronously returns a CallAgent instance object.
The createCallAgent method takes CommunicationTokenCredential as an argument, which encapsulates an access token.
To access DeviceManager, you must create a callAgent instance first. Then you can use the CallClient.getDeviceManager method to get DeviceManager.
String userToken = "<user token>";
CallClient callClient = new CallClient();
CommunicationTokenCredential tokenCredential = new CommunicationTokenCredential(userToken);
android.content.Context appContext = this.getApplicationContext(); // From within an activity, for instance
CallAgent callAgent = callClient.createCallAgent(appContext, tokenCredential).get();
DeviceManager deviceManager = callClient.getDeviceManager(appContext).get();
To set a display name for the caller, use this alternative method:
String userToken = "<user token>";
CallClient callClient = new CallClient();
CommunicationTokenCredential tokenCredential = new CommunicationTokenCredential(userToken);
android.content.Context appContext = this.getApplicationContext(); // From within an activity, for instance
CallAgentOptions callAgentOptions = new CallAgentOptions();
callAgentOptions.setDisplayName("Alice Bob");
DeviceManager deviceManager = callClient.getDeviceManager(appContext).get();
CallAgent callAgent = callClient.createCallAgent(appContext, tokenCredential, callAgentOptions).get();
Dominant Speakers is an extended feature of the core Call object that allows the user to monitor the most dominant speakers in the current call. Participants can enter and leave the list based on how active they are in the call.
When joined to a group call consisting of multiple participants, the calling SDKs identify which meeting participants are currently speaking. Active speakers identify which participants are being heard in each received audio frame. Dominant speakers identify which participants are currently most active or dominant in the group conversation, though their voice is not necessarily heard in every audio frame. The set of dominant speakers can change as different participants take turns speaking, so video subscription requests can be implemented based on dominant speaker logic.
The main idea is that as participants join, leave, or move up or down in this list, the client application can use this information to customize the call experience. For example, the client application can show the most dominant speakers in a different UI to separate them from participants who are not actively participating in the call.
Developers can receive updates and obtain information about the most dominant speakers in a call. This information is represented as:
- An ordered list of the Remote Participants that represents the Dominant Speakers in the call.
- A timestamp marking the date when this list was last modified.
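The idea of showing the most dominant speakers in a different part of the UI can be sketched as a simple partition of the participant list. The example is written in TypeScript to match the other added examples and is platform-neutral: it assumes participants are identified by string keys and that the ranked list is ordered from most to least dominant. The partitionByDominance helper and its parameter names are hypothetical, not SDK APIs.

```typescript
// Split participants into the top-ranked dominant speakers and everyone else.
function partitionByDominance(
    rankedSpeakerIds: readonly string[],
    allParticipantIds: readonly string[],
    maxHighlighted: number
): { highlighted: string[]; others: string[] } {
    // Keep at most maxHighlighted entries from the front of the ranked list.
    const highlighted = rankedSpeakerIds.slice(0, Math.max(0, maxHighlighted));
    const highlightedSet = new Set(highlighted);
    // Everyone not highlighted keeps their original order.
    const others = allParticipantIds.filter(id => !highlightedSet.has(id));
    return { highlighted, others };
}
```

An application could render the highlighted group in large tiles and the others in a compact strip, and recompute the partition whenever the dominant speakers list changes.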
In order to use the Dominant Speakers call feature for Android, the first step is to obtain the Dominant Speakers feature API object:
DominantSpeakersFeature dominantSpeakersFeature = call.feature(Features.DOMINANT_SPEAKERS);
The Dominant Speakers feature object has the following API structure:
- OnDominantSpeakersChanged: Event for listening for changes in the dominant speakers list.
- getDominantSpeakersInfo(): Gets the DominantSpeakersInfo object. This object has:
  - getSpeakers(): A list of participant identifiers representing the dominant speakers list.
  - getLastUpdatedAt(): The date when the dominant speakers list was updated.
To subscribe to changes in the Dominant Speakers list:
// Obtain the extended feature object from the call object.
DominantSpeakersFeature dominantSpeakersFeature = call.feature(Features.DOMINANT_SPEAKERS);
// Subscribe to the OnDominantSpeakersChanged event.
dominantSpeakersFeature.addOnDominantSpeakersChangedListener(this::handleCallOnDominantSpeakersChanged);
private void handleCallOnDominantSpeakersChanged(PropertyChangedEvent args) {
// When the list changes, get the timestamp of the last change and the current list of Dominant Speakers
DominantSpeakersInfo dominantSpeakersInfo = dominantSpeakersFeature.getDominantSpeakersInfo();
Date timestamp = dominantSpeakersInfo.getLastUpdatedAt();
List<CommunicationIdentifier> dominantSpeakers = dominantSpeakersInfo.getSpeakers();
}
Set up your system
Create the Visual Studio project
For a UWP app, in Visual Studio 2022, create a new Blank App (Universal Windows) project. After you enter the project name, feel free to choose any Windows SDK later than 10.0.17763.0.
For a WinUI 3 app, create a new project with the Blank App, Packaged (WinUI 3 in Desktop) template to set up a single-page WinUI 3 app. Windows App SDK version 1.3 or later is required.
Install the package and dependencies by using NuGet Package Manager
The Calling SDK APIs and libraries are publicly available via a NuGet package.
The following steps exemplify how to find, download, and install the Calling SDK NuGet package:
- Open NuGet Package Manager by selecting Tools > NuGet Package Manager > Manage NuGet Packages for Solution.
- Select Browse, and then enter Azure.Communication.Calling.WindowsClient in the search box.
- Make sure that the Include prerelease check box is selected.
- Select the Azure.Communication.Calling.WindowsClient package, and then select Azure.Communication.Calling.WindowsClient 1.4.0-beta.1 or a newer version.
- Select the checkbox that corresponds to the Communication Services project on the right-side tab.
- Select the Install button.
Dominant Speakers is an extended feature of the core Call object that allows the user to monitor the most dominant speakers in the current call. Participants can enter and leave the list based on how active they are in the call.
When joined to a group call consisting of multiple participants, the calling SDKs identify which meeting participants are currently speaking. Active speakers identify which participants are being heard in each received audio frame. Dominant speakers identify which participants are currently most active or dominant in the group conversation, though their voice is not necessarily heard in every audio frame. The set of dominant speakers can change as different participants take turns speaking, so video subscription requests can be implemented based on dominant speaker logic.
The main idea is that as participants join, leave, or move up or down in this list, the client application can use this information to customize the call experience. For example, the client application can show the most dominant speakers in a different UI to separate them from participants who are not actively participating in the call.
Developers can receive updates and obtain information about the most dominant speakers in a call. This information is represented as:
- An ordered list of the Remote Participants that represents the Dominant Speakers in the call.
- A timestamp marking the date when this list was last modified.
In order to use the Dominant Speakers call feature for Windows, the first step is to obtain the Dominant Speakers feature API object:
DominantSpeakersCallFeature dominantSpeakersFeature = call.Features.DominantSpeakers;
The Dominant Speakers feature object has the following API structure:
- OnDominantSpeakersChanged: Event for listening for changes in the dominant speakers list.
- DominantSpeakersInfo: Gets the DominantSpeakersInfo object. This object has:
  - Speakers: A list of participant identifiers representing the dominant speakers list.
  - LastUpdatedAt: The date when the dominant speakers list was updated.
To subscribe to changes in the dominant speakers list:
// Obtain the extended feature object from the call object.
DominantSpeakersCallFeature dominantSpeakersFeature = call.Features.DominantSpeakers;
// Subscribe to the OnDominantSpeakersChanged event.
dominantSpeakersFeature.OnDominantSpeakersChanged += DominantSpeakersFeature__OnDominantSpeakersChanged;
private void DominantSpeakersFeature__OnDominantSpeakersChanged(object sender, PropertyChangedEventArgs args) {
// When the list changes, get the timestamp of the last change and the current list of Dominant Speakers
DominantSpeakersInfo dominantSpeakersInfo = dominantSpeakersFeature.DominantSpeakersInfo;
DateTimeOffset date = dominantSpeakersInfo.LastUpdatedAt;
IReadOnlyList<ICommunicationIdentifier> speakersList = dominantSpeakersInfo.Speakers;
}
Set up your system
Create the Xcode project
In Xcode, create a new iOS project and select the Single View App template. This quickstart uses the SwiftUI framework, so you should set Language to Swift and set Interface to SwiftUI.
You're not going to create tests during this quickstart. Feel free to clear the Include Tests checkbox.
Install the package and dependencies by using CocoaPods
Create a Podfile for your application, like this example:
platform :ios, '13.0'
use_frameworks!
target 'AzureCommunicationCallingSample' do
  pod 'AzureCommunicationCalling', '~> 1.0.0'
end
Run pod install.
Open .xcworkspace by using Xcode.
Request access to the microphone
To access the device's microphone, you need to update your app's information property list by using NSMicrophoneUsageDescription. You set the associated value to a string that will be included in the dialog that the system uses to request access from the user.
Right-click the Info.plist entry of the project tree, and then select Open As > Source Code. Add the following lines in the top-level <dict> section, and then save the file.
<key>NSMicrophoneUsageDescription</key>
<string>Need microphone access for VOIP calling.</string>
Set up the app framework
Open your project's ContentView.swift file. Add an import declaration to the top of the file to import the AzureCommunicationCalling library. In addition, import AVFoundation. You'll need it for audio permission requests in the code.
import AzureCommunicationCalling
import AVFoundation
Initialize CallAgent
To create a CallAgent instance from CallClient, you have to use a callClient.createCallAgent method that asynchronously returns a CallAgent object after it's initialized.
To create a call client, pass a CommunicationTokenCredential object:
import AzureCommunication
let tokenString = "token_string"
var userCredential: CommunicationTokenCredential?
do {
let options = CommunicationTokenRefreshOptions(initialToken: tokenString, refreshProactively: true, tokenRefresher: self.fetchTokenSync)
userCredential = try CommunicationTokenCredential(withOptions: options)
} catch {
updates("Couldn't create Credential object", false)
initializationDispatchGroup!.leave()
return
}
// tokenProvider needs to be implemented by Contoso, which fetches a new token
public func fetchTokenSync(then onCompletion: TokenRefreshOnCompletion) {
let newToken = self.tokenProvider!.fetchNewToken()
onCompletion(newToken, nil)
}
Pass the CommunicationTokenCredential object that you created to CallClient, and set the display name:
self.callClient = CallClient()
let callAgentOptions = CallAgentOptions()
callAgentOptions.displayName = "iOS Azure Communication Services User"
self.callClient!.createCallAgent(userCredential: userCredential!,
    options: callAgentOptions) { (callAgent, error) in
    if error == nil {
        print("Create agent succeeded")
        self.callAgent = callAgent
    } else {
        print("Create agent failed")
    }
}
Dominant Speakers is an extended feature of the core Call object that allows the user to monitor the most dominant speakers in the current call. Participants can enter and leave the list based on how active they are in the call.
When joined to a group call consisting of multiple participants, the calling SDKs identify which meeting participants are currently speaking. Active speakers identify which participants are being heard in each received audio frame. Dominant speakers identify which participants are currently most active or dominant in the group conversation, though their voice is not necessarily heard in every audio frame. The set of dominant speakers can change as different participants take turns speaking, so video subscription requests can be implemented based on dominant speaker logic.
The main idea is that as participants join, leave, or move up or down in this list, the client application can use this information to customize the call experience. For example, the client application can show the most dominant speakers in a different UI to separate them from participants who are not actively participating in the call.
Developers can receive updates and obtain information about the most dominant speakers in a call. This information is represented as:
- An ordered list of the Remote Participants that represents the Dominant Speakers in the call.
- A timestamp marking the date when this list was last modified.
In order to use the Dominant Speakers call feature for iOS, the first step is to obtain the Dominant Speakers feature API object:
let dominantSpeakersFeature = call.feature(Features.dominantSpeakers)
The Dominant Speakers feature object has the following API structure:
- didChangeDominantSpeakers: Event for listening for changes in the dominant speakers list.
- dominantSpeakersInfo: Gets the DominantSpeakersInfo object. This object has:
  - speakers: A list of participant identifiers representing the dominant speakers list.
  - lastUpdatedAt: The date when the dominant speakers list was updated.
To subscribe to changes in the dominant speakers list:
// Obtain the extended feature object from the call object.
let dominantSpeakersFeature = call.feature(Features.dominantSpeakers)
// Set the delegate object to obtain the event callback.
dominantSpeakersFeature.delegate = DominantSpeakersDelegate()
public class DominantSpeakersDelegate : DominantSpeakersCallFeatureDelegate
{
public func dominantSpeakersCallFeature(_ dominantSpeakersCallFeature: DominantSpeakersCallFeature, didChangeDominantSpeakers args: PropertyChangedEventArgs) {
// When the list changes, get the timestamp of the last change and the current list of Dominant Speakers
let dominantSpeakersInfo = dominantSpeakersCallFeature.dominantSpeakersInfo
let timestamp = dominantSpeakersInfo.lastUpdatedAt
let dominantSpeakersList = dominantSpeakersInfo.speakers
}
}