I've been working on software inspired by Robertsmania's live streams of races on Twitch, where he is able to use voice commands while he is driving to manipulate the view of the race being shown to his viewers. He can do things through voice commands like change the video to show a different car, select a different camera angle, or replay interesting bits of the race, all while driving the live race himself.
I have some of the basic functionality that Robertsmania has, but my system is not nearly as polished as his. Mine has a lot of rough edges, while his looks and feels very professional. For example, his voice command interactions sound much more conversational because of the spoken responses he has built into his system, and the interstitial video transitions between live racing and replays make the video presentation really nice.
We are both using purchased software called Voice Attack to handle translating voice commands into actions. Robertsmania told me that he's implemented his system using Voice Attack "plugins", which the Voice Attack documentation covers in a 40-page section and describes as being "for the truly mad." Voice Attack plugins are written in C#, which I am not expert in and did not want to adopt at scale. So, through many weeks of frustration and fumbling around, I figured out how to make a C# Voice Attack plugin call code I wrote in Go. I did not know Go either, but it is very easy to learn and use. (Since "go" is rarely a useful search term on the internet, the community uses "golang" in all of its documentation to make it searchable.) Once I got a proof of concept working with Voice Attack running my C# plugin calling my Golang DLL, I was (as it were) off to the races. After that, I split my Golang DLL into a client and a server, which has a couple of advantages. First, the server runs outside the context of a Voice Attack plugin, so its services can be shared with, for example, a SimHub plugin, and I can update and deploy a new version of the server more easily and quickly. Second, I can add web pages to the server that show me information about what it's doing, which makes working on it much easier.
At this point, I have some of the basic features that Robertsmania has and, as with everything in software, even the easy ones turned out to have wrinkles I didn't think about when I started.
Car and camera selection during the race: I can change the video using voice commands while driving to show a car by number or a camera by name. Showing cars by number turns out to be a bit of a headache in the general case, for two reasons: (a) in non-official sessions, iRacing allows three different cars to have the numbers "7", "07", and "007", so the car number is a string and not numeric; (b) Voice Attack limits the number of potential matches that can exist for a given command, so you can't have a command that recognizes all numbers from 0 to 999.
Basic replays of things that happened within the last minute: I can use a voice command to show a replay for a specific car for any amount of time within the last 60 seconds of the current live race.
Miscellaneous stream and video administrative commands: These don't matter to the viewer, but I can start and stop the stream, turn on or off the iRacing UI controls on the screen, and control what view of the streaming software I can see while I'm driving.
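To make the car-number wrinkle from the car selection item concrete, here is a minimal Go sketch of looking up a car by number while treating the number as text. The `Car` type, its fields, and `FindCar` are names I made up for illustration; they are not from the iRacing SDK or from my actual server.

```go
package main

import "fmt"

// Car is a hypothetical record for one entrant. The field names are
// illustrative only and are not taken from the iRacing SDK.
type Car struct {
	Number string // kept as text so "7", "07", and "007" stay distinct
	Driver string
}

// FindCar returns the car whose number matches exactly, compared as a string.
// Converting to an integer first would collapse "7", "07", and "007" into 7.
func FindCar(cars []Car, number string) (Car, bool) {
	for _, c := range cars {
		if c.Number == number {
			return c, true
		}
	}
	return Car{}, false
}

func main() {
	cars := []Car{
		{Number: "7", Driver: "Driver A"},
		{Number: "07", Driver: "Driver B"},
		{Number: "007", Driver: "Driver C"},
	}
	for _, n := range []string{"7", "07", "007", "70"} {
		if c, ok := FindCar(cars, n); ok {
			fmt.Printf("car %q is %s\n", n, c.Driver)
		} else {
			fmt.Printf("car %q is not in this session\n", n)
		}
	}
}
```

The only real design choice here is comparing strings rather than parsed integers. It does not address the Voice Attack limit on the number of matches per command, which has to be handled on the voice command side.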
A large ongoing effort is detecting and showing interesting parts of the race automatically. For example, Robertsmania can say, "Show recent incidents for car 7", and the system will show a sequence of short replays of anything interesting it found for car 7. From observation and from talking to him, I have a good idea of what goes into this in his system, and I am working toward something that may be a little bit better. It involves processing telemetry that the simulation provides for every car 60 times per second, and using that to detect when cars are doing something out of the ordinary, such as being stopped on the track during the race. iRacing does not tell you when a car has spun or crashed, but you can detect it somewhat reliably by analyzing speeds on the track and seeing when someone is going a lot slower than they should be.
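As a rough illustration of the "a lot slower than they should be" idea, here is a minimal Go sketch. It is not how Robertsmania's system or mine actually works. It assumes a per-segment table of expected speeds, and it flags a car once its speed has stayed below a fraction of the expected speed for a minimum number of consecutive samples. The `Sample` and `Detector` types, their fields, and the thresholds are all hypothetical.

```go
package main

import "fmt"

// sampleRate is the telemetry rate described above: 60 samples per second.
const sampleRate = 60

// Sample is a hypothetical per-car telemetry reading. The field names are
// illustrative and are not taken from the iRacing SDK.
type Sample struct {
	LapPct float64 // position around the lap, from 0.0 to 1.0
	Speed  float64 // metres per second
}

// Detector tracks one car and flags it after a sustained slow period.
type Detector struct {
	Expected   []float64 // assumed baseline: expected speed per track segment
	Ratio      float64   // slow means Speed < Ratio * expected speed
	MinSamples int       // consecutive slow samples needed before flagging
	slowRun    int       // current count of consecutive slow samples
}

// segment maps a lap position to an index into Expected, clamped to range.
func (d *Detector) segment(lapPct float64) int {
	i := int(lapPct * float64(len(d.Expected)))
	if i < 0 {
		i = 0
	}
	if i >= len(d.Expected) {
		i = len(d.Expected) - 1
	}
	return i
}

// Update consumes one sample and returns true only on the sample where the
// slow run first reaches MinSamples, so each slow period is flagged once.
func (d *Detector) Update(s Sample) bool {
	if len(d.Expected) == 0 {
		return false // no baseline, nothing to compare against
	}
	if s.Speed < d.Ratio*d.Expected[d.segment(s.LapPct)] {
		d.slowRun++
		return d.slowRun == d.MinSamples
	}
	d.slowRun = 0
	return false
}

func main() {
	d := &Detector{
		Expected:   []float64{50, 50, 50, 50}, // made-up baseline speeds
		Ratio:      0.5,
		MinSamples: sampleRate / 2, // half a second of slow samples
	}
	// One second at normal speed, then one second nearly stopped.
	for i := 0; i < 2*sampleRate; i++ {
		speed := 50.0
		if i >= sampleRate {
			speed = 2.0
		}
		if d.Update(Sample{LapPct: 0.3, Speed: speed}) {
			fmt.Printf("possible incident at sample %d\n", i)
		}
	}
}
```

Requiring several consecutive slow samples keeps a single noisy reading from triggering a replay. A real version would also need to ignore cars that are slow for legitimate reasons, such as driving in the pit lane, which this sketch does not attempt.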