Tip Adding an offline voice interface to a cross-platform .NET GUI

laves

For those who want to dive deeper, here's a tutorial I wrote and the source code!


Here's a little tutorial about how I added a voice interface to a cross-platform desktop app:



In a world where developers are racing to deliver MVPs and users are expecting seamless UX across their ever-expanding arsenal of devices, few things are as important to modern app developers as a good cross-platform framework. Up to this point, there has not been a strong solution for adding an offline voice-user interface (VUI) to your cross-platform application. Picovoice has recently removed this limitation for developers by releasing a .NET SDK that will allow you to add voice support across desktop platforms.

For .NET developers, the road to cross-platform support has been long and winding. The introduction of .NET Core delivered some great cross-platform possibilities, but it lacked a cross-platform GUI framework. Luckily, the community responded and created its own. AvaloniaUI is an amazing open-source project for .NET Core that already allows pixel-perfect rendering across all desktop platforms, with mobile platform support and WebAssembly browser support in active development.

As complex as a cross-platform GUI may be under the hood, the result is still a collection of controls that require a user’s touch or keyboard input to interact with. The COVID-19 pandemic has brought with it a renewed desire for touchless interfaces in the market. Beyond that, an interface that relies only on touch or keyboard input narrows your app’s user base from an accessibility point of view and ignores the often-desired hands-free use case. Voice-enabling a GUI is a great way to bring a whole new dimension to your interface.

In this article, I’m going to build a basic GUI with AvaloniaUI and then voice-enable it — without changing any of the UI code — with Picovoice’s Porcupine wake word engine. The UI is going to consist of some radio buttons that we’re going to toggle to change the color of the background. Let’s jump in!

The easiest way to get started with Avalonia is to install its Visual Studio extension, which provides a designer and some project templates. Then, open Visual Studio and create a new project using the Avalonia MVVM Application template.

The template will provide you with some basic app code and MVVM glue. For this tutorial, we’re only going to be adding code to MainWindow.axaml and MainWindowViewModel.cs.
In MainWindow.axaml, we’re going to build our UI. We’ll add four radio buttons in a centered stack panel (…I said it was going to be basic!). We’ll also add some bindings to the IsChecked property of the radio buttons and the Color property of the Window background brush. Our goal here is to have the window color change depending on which radio button is selected.

XML:
<Window xmlns="https://github.com/avaloniaui"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        xmlns:vm="clr-namespace:AvaloniaVUI.ViewModels;assembly=AvaloniaVUI"
        xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
        xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
        mc:Ignorable="d" d:DesignWidth="800" d:DesignHeight="450"
        x:Class="AvaloniaVUI.Views.MainWindow"
        Icon="/Assets/avalonia-logo.ico"
        Title="AvaloniaVUI"
        Width="400"
        Height="400">
    <Window.Background>
        <SolidColorBrush Color="{Binding BackgroundColor}"/>
    </Window.Background>
    <Design.DataContext>
        <vm:MainWindowViewModel/>
    </Design.DataContext>
    <StackPanel HorizontalAlignment="Center"
                VerticalAlignment="Center">
        <RadioButton GroupName="WakeWords"
                     Margin="5"
                     Content="Grapefruit"
                     IsChecked="{Binding IsGrapefruit}"/>
        <RadioButton GroupName="WakeWords"
                     Margin="5"
                     Content="Grasshopper"
                     IsChecked="{Binding IsGrasshopper}"/>
        <RadioButton GroupName="WakeWords"
                     Margin="5"
                     Content="Bumblebee"
                     IsChecked="{Binding IsBumblebee}"/>
        <RadioButton GroupName="WakeWords"
                     Margin="5"
                     Content="Blueberry"
                     IsChecked="{Binding IsBlueberry}"/>
    </StackPanel>
</Window>
Now we can head to the view model MainWindowViewModel.cs and add the properties that we’ve specified bindings for. We’ll add a boolean property for each radio button and a color property for the window background. We’ll then add a list of four colors that we’ll use to change the background color depending on which radio button is selected. To change the background when a radio button is selected, we can add code to the IsChecked setters to swap the background color with one from our list of colors.
C#:
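using System.Collections.Generic;
using Avalonia.Media;
using ReactiveUI;

public class MainWindowViewModel : ViewModelBase
{
    // Color bound to the window's background brush.
    private Color _backgroundColor = Colors.LightGray;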
    public Color BackgroundColor
    {
        get => _backgroundColor;
        set => this.RaiseAndSetIfChanged(ref _backgroundColor, value);
    }

    private readonly List<Color> _bgColors = new List<Color>
    {
        Colors.LightPink,
        Colors.LawnGreen,
        Colors.Yellow,
        Colors.BlueViolet
    };


    private bool _isGrapefruit;
    public bool IsGrapefruit
    {
        get => _isGrapefruit;
        set
        {
            this.RaiseAndSetIfChanged(ref _isGrapefruit, value);
            // Only react when this button becomes checked, not when it is unchecked.
            if (value) BackgroundColor = _bgColors[0];
        }
    }

    private bool _isGrasshopper;
    public bool IsGrasshopper
    {
        get => _isGrasshopper;
        set
        {
            this.RaiseAndSetIfChanged(ref _isGrasshopper, value);
            // Only react when this button becomes checked, not when it is unchecked.
            if (value) BackgroundColor = _bgColors[1];
        }
    }

    private bool _isBumblebee;
    public bool IsBumblebee
    {
        get => _isBumblebee;
        set
        {
            this.RaiseAndSetIfChanged(ref _isBumblebee, value);
            // Only react when this button becomes checked, not when it is unchecked.
            if (value) BackgroundColor = _bgColors[2];
        }
    }

    private bool _isBlueberry;
    public bool IsBlueberry
    {
        get => _isBlueberry;
        set
        {
            this.RaiseAndSetIfChanged(ref _isBlueberry, value);
            // Only react when this button becomes checked, not when it is unchecked.
            if (value) BackgroundColor = _bgColors[3];
        }
    }
}
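
A note on those setters: a radio group also writes false to the bound property of whichever button is being deselected, so the color should only change when the incoming value is true. Here's a minimal, framework-free sketch of that rule. `BackgroundPicker` and `Next` are made-up names for illustration, and the colors are plain strings rather than Avalonia `Color` values.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of Avalonia or the tutorial project).
public static class BackgroundPicker
{
    // Same order as the color list in the view model.
    private static readonly IReadOnlyList<string> ColorNames = new[]
    {
        "LightPink", "LawnGreen", "Yellow", "BlueViolet"
    };

    // Returns the color for the button at 'index' when it is being checked;
    // keeps the current color when the button is merely being unchecked.
    public static string Next(string current, int index, bool isChecked)
    {
        return isChecked ? ColorNames[index] : current;
    }
}
```

With this rule, it no longer matters whether the "checked" or the "unchecked" notification arrives last.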

If we run the app, we’ll see the four radio buttons and can change the window color by clicking on any of them.

Now we’re going to swap those clicks for voice commands. We’ll use Porcupine to detect which of the words we’ve said and then select the appropriate radio button. Add the Porcupine NuGet package to the project and create an instance of the wake word engine like so:
C#:
public MainWindowViewModel()
{
    List<string> commands = new List<string>
    {
        "grapefruit",
        "grasshopper",
        "bumblebee",
        "blueberry"
    };
    // Note: a using declaration disposes the engine when the constructor returns.
    using Porcupine porcupine = Porcupine.Create(keywords: commands);
}

To detect the keywords, we need to capture frames of audio and pass them to Porcupine for processing. A great open-source, cross-platform audio library is OpenAL, which can be accessed through the OpenTK NuGet package. We’ll add that to our project and add some code to capture frames of audio for Porcupine using the default audio capture device. Each frame is passed to Porcupine for processing and an index is returned. If the index is -1, then no wake word was detected, but if the keyword index is 0 to 3, then one of our four wake words was detected. To select a radio button when the associated wake word is detected, we call the setter for that radio button’s IsChecked property. Finally, we’ll put the whole thing on a separate thread so that it doesn’t block the UI. Now our view model constructor will look like this:
C#:
public MainWindowViewModel()
{
    Task.Factory.StartNew(() =>
    {
        List<string> commands = new List<string> {
            "grapefruit",
            "grasshopper",
            "bumblebee",
            "blueberry"
        };
        using Porcupine porcupine = Porcupine.Create(keywords: commands);

        short[] frameBuffer = new short[porcupine.FrameLength];
        ALCaptureDevice captureDevice =
            ALC.CaptureOpenDevice(null,
                                  porcupine.SampleRate,
                                  ALFormat.Mono16,
                                  porcupine.FrameLength * 2);
        {
            ALC.CaptureStart(captureDevice);
            
            while (true)
            {
                int samplesAvailable = ALC.GetAvailableSamples(captureDevice);
                // A full frame is enough to process, so compare with >= rather than >.
                if (samplesAvailable >= porcupine.FrameLength)
                {
                    ALC.CaptureSamples(captureDevice,
                                       ref frameBuffer[0],
                                       porcupine.FrameLength);
                    
                    int keywordIndex = porcupine.Process(frameBuffer);
                    if (keywordIndex >= 0)
                    {
                        switch (keywordIndex)
                        {
                            case 0:
                                IsGrapefruit = true;
                                break;
                            case 1:
                                IsGrasshopper = true;
                                break;
                            case 2:
                                IsBumblebee = true;
                                break;
                            case 3:
                                IsBlueberry = true;
                                break;
                        }
                    }
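                }
                else
                {
                    // Yield briefly instead of spinning while the next frame fills.
                    System.Threading.Thread.Sleep(1);
                }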
            }
        }
    }, TaskCreationOptions.LongRunning); // hint that this endless loop needs its own thread
}
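
One thing the loop above never does is stop. Here's a rough sketch of the same capture, process and dispatch loop with a `CancellationToken`, written against delegates so it runs without a microphone. `ListenLoop`, `tryReadFrame`, `process` and `onKeyword` are hypothetical stand-ins for the OpenAL capture calls, `porcupine.Process` and the property setters; they are not part of either library.

```csharp
using System;
using System.Threading;

// Hypothetical helper: a cancellable version of the capture loop.
public static class ListenLoop
{
    // tryReadFrame: fills the buffer and returns true when a full frame is ready.
    // process:      returns the detected keyword index, or -1 for no detection.
    // onKeyword:    called with the index of each detected keyword.
    // Returns the number of detections seen before cancellation.
    public static int Run(
        Func<short[], bool> tryReadFrame,
        Func<short[], int> process,
        Action<int> onKeyword,
        int frameLength,
        CancellationToken token)
    {
        short[] frame = new short[frameLength];
        int detections = 0;

        while (!token.IsCancellationRequested)
        {
            if (!tryReadFrame(frame))
            {
                // No full frame yet: wait briefly instead of spinning.
                Thread.Sleep(1);
                continue;
            }

            int keywordIndex = process(frame);
            if (keywordIndex >= 0)
            {
                onKeyword(keywordIndex);
                detections++;
            }
        }

        return detections;
    }
}
```

In the view model, the token could come from a `CancellationTokenSource` that is cancelled when the window closes, after which the capture device and the Porcupine instance can be released.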


Launch the app and you should now be able to say any of the four commands and watch the matching radio button get selected without any mouse input.

And there you have it — you’ve now added voice commands to an app without changing any of the UI code. In a real-world app, we could attach commands for “Save”, “Open File” or “Settings” and enable more complex inputs with Picovoice’s Rhino Speech-to-Intent engine. Adding this new dimension of interaction to your app has tremendous value and could be the feature that gives your app the edge in a crowded marketplace.
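
As a loose illustration of attaching app commands to keywords, the `switch` on the keyword index could be swapped for a small lookup table. This is only a sketch under my own assumptions: `KeywordDispatcher` and `Dispatch` are invented names, and the handlers are whatever actions your app exposes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: maps a keyword index to an action.
public sealed class KeywordDispatcher
{
    private readonly IReadOnlyList<Action> _handlers;

    // handlers[i] runs when the keyword at index i is detected,
    // so the order must match the keyword list given to the engine.
    public KeywordDispatcher(IReadOnlyList<Action> handlers)
    {
        _handlers = handlers ?? throw new ArgumentNullException(nameof(handlers));
    }

    // Returns true if a handler ran, false for -1 or any unknown index.
    public bool Dispatch(int keywordIndex)
    {
        if (keywordIndex < 0 || keywordIndex >= _handlers.Count)
        {
            return false;
        }

        _handlers[keywordIndex]();
        return true;
    }
}
```

For the demo above, the handlers would be `() => IsGrapefruit = true` and so on. Since detection runs on a background thread, a handler that touches UI objects directly may need to be marshalled to the UI thread.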

The full source code for this project can be found here.

For more information regarding Picovoice SDKs and products, visit our website and docs, or explore our GitHub repositories.
 
Thank you for your tutorial.

However, we cannot guarantee how long the website you posted it on will be around, so it would be preferred if you posted the article content on this website instead. If the link expires or if that site changes its URLs in the future, your article will be lost. And that doesn't help our readers, who might want to read it.
 
Thanks for the tip - I'll keep that in mind for next time!
 