Rhasspy with openHAB, part 2: voice commands

The bird’s eye view of what has to happen here is:

  1. Define items in openHAB once, and declare them as voice-related by assigning them to specific groups.
  2. Rhasspy gets a list of all voice-related openHAB items and builds command sentences accordingly (see below, How to tell Rhasspy what to expect).
  3. When I speak a voice command and Rhasspy recognizes it, Rhasspy sets (via MQTT) an openHAB string item specific to the category of voice command (e.g. switch, dimmer, query, scene) to a JSON-formatted text that contains all relevant information: the name of the item, the desired value, and the name of the satellite that picked up the voice command.
  4. The update of the string item triggers an openHAB rule, which parses the JSON text and performs the requested action.
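
For example, the spoken sentence “turn off left hallway light” is recognized as intent SetOneLight, published by Rhasspy on MQTT topic hermes/intent/SetOneLight, stored in the string item Rhasspy_SetOneLight, and handled by a rule that switches the light; all of this is shown below.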

That was the bird’s eye view. And now for the details …

How to tell Rhasspy what to expect

Whenever I add or edit an openHAB item that should be the subject of a voice interaction, I want the Rhasspy configuration to change automatically, to reflect the change in openHAB. For this purpose, I have defined several openHAB groups:

Group name  Purpose
gVA         voice-activated items like light switches that afford a “turn on”/”turn off” command, see Turning items on or off below
gVD         voice-controlled dimmer items that afford a “dim to x percent” command
gVQ         voice-based queries that respond by speaking the state of an item, like temperature, see Voice questions and answers below
gVS         voice-based scene activation, i.e. turn on or off multiple lights based on a command like “let’s watch TV” or “let’s go to bed”

A script oh_items queries openHAB via its REST API for a list of all items, and outputs the names and labels of those items that belong to a specified group.
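
For group gVA, the output looks something like this, in Rhasspy’s (spoken label):substituted value slot format; the second line assumes a hypothetical fan item from the example further below:

(left hallway light):HWL_Light_Proxy
(main bathroom fan):BathFan_Proxy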

This script is called repeatedly by my sentences.ini file, which looks like so

[SetOneLight]
(turn | switch) (on | off){state!upper} [the] ($oh_items,gVA){lightName} [please] 
(turn | switch) [the] ($oh_items,gVA){lightName} (on | off){state!upper} [please] 

[VoiceQuestion]
what is the ($oh_items,gVQ){topic!lower}
what ($oh_items,gVQ){topic!lower} is it

[VoiceScene]
lets ($oh_items,gVS){topic}

[SetDimmer]
set ($oh_items,gVD){itemName} dimmer (to | at) (10..100,10){percent} percent
dim ($oh_items,gVD){itemName} to (10..100,10){percent} percent
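
To recap Rhasspy’s template syntax: (a | b) are alternatives, [word] is optional, {name} tags the matched text as a named slot, !upper and !lower convert the slot value’s case, ($oh_items,gVA) runs the slot program oh_items with argument gVA, and (10..100,10) matches numbers from 10 to 100 in steps of 10.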

Every time you create new openHAB Items that you want controlled by voice commands, you need to revisit http://your-rhasspy-server-name:12101/ and click “Save Sentences”. This will re-run the scripts, and re-train the speech recognition engine.
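
If you would rather automate this step, Rhasspy’s HTTP API can do the same; a minimal sketch, assuming Rhasspy 2.5 and its POST /api/train endpoint:

#!/usr/bin/python3
# Trigger a Rhasspy retrain via the HTTP API (sketch; assumes Rhasspy 2.5
# and the server name used above, adjust for your installation).
from requests import post

response = post("http://your-rhasspy-server-name:12101/api/train")
response.raise_for_status()
print(response.text)  # Rhasspy reports the training result as plain text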

On my Rhasspy installation, the script is stored in ~/.config/rhasspy/profiles/en/slot_programs/oh_items and contains

#!/usr/bin/python3
# Rhasspy slot program: list all openHAB items in a given group,
# one "(label):name" line per item.
from requests import get
import sys

if len(sys.argv) < 2:
    print('No group name specified', file=sys.stderr)
    exit(1)
groupname = sys.argv[1]
print(f'Create Rhasspy slots from openHAB items in group {groupname} ...', file=sys.stderr)

# query the openHAB REST API for a list of all items
url = "http://ha-server:8080/rest/items?recursive=false"
headers = {
    "content-type": "application/json",
}
response = get(url, headers=headers)
items = response.json()

for item in items:
    if groupname in item['groupNames']:
        # Rhasspy speaks the label, openHAB receives the item name
        print(f"({item['label']}):{item['name']}")

Making openHAB listen to Rhasspy

I define a Thing, with channels for each category of voice commands, in things/rhasspy.things like so

Thing mqtt:topic:rh-mq:vc (mqtt:broker:rh-mq) {
  Channels:
    Type string: SetOneLight   [ stateTopic="hermes/intent/SetOneLight" ]
    Type string: SetDimmer     [ stateTopic="hermes/intent/SetDimmer" ]
    Type string: VoiceQuestion [ stateTopic="hermes/intent/VoiceQuestion" ]
    Type string: VoiceScene    [ stateTopic="hermes/intent/VoiceScene" ]
    Type string: StartTimer    [ stateTopic="hermes/intent/StartTimer" ]
}
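
This assumes the broker bridge mqtt:broker:rh-mq is already defined; if not, a minimal definition could look like this (host name taken from my setup, adjust as needed):

Bridge mqtt:broker:rh-mq "Rhasspy MQTT broker" [ host="ha-server", port=1883 ]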

and corresponding Items, one per voice command category, in items/rhasspy.items like so

String  Rhasspy_SetOneLight "Rhasspy Message"  { channel="mqtt:topic:rh-mq:vc:SetOneLight" }
String  Rhasspy_SetDimmer   "Rhasspy Dimmer"   { channel="mqtt:topic:rh-mq:vc:SetDimmer" }
String  Rhasspy_Question    "Rhasspy Question" { channel="mqtt:topic:rh-mq:vc:VoiceQuestion" }
String  Rhasspy_Scene       "Rhasspy Scene"    { channel="mqtt:topic:rh-mq:vc:VoiceScene" }

An example of a voice-activated item would be the hallway light, defined in a .items file like so

Switch HWL_Light_Proxy    "left hallway light"    <light>  (gVA)	

When I say “turn off left hallway light”, Rhasspy will set item Rhasspy_SetOneLight to a long JSON string, like this

{
 "input": "turn OFF HWL_Light_Proxy", 
 "intent": {
  "intentName": "SetOneLight", 
  ... more stuff, deleted for brevity ...
 }, 
 "siteId": "raspi7", 
 "id": null, 
 "slots": [
  {
   "entity": "state", 
   "value": {"kind": "Unknown", "value": "OFF"}, 
   "slotName": "state", 
   "rawValue": "off", 
   ... more stuff, deleted for brevity ...
  }, 
  {
   "entity": "oh_items", 
   "value": {"kind": "Unknown", "value": "HWL_Light_Proxy"}, 
   "slotName": "lightName", 
   "rawValue": "left hallway light", 
   ... more stuff, deleted for brevity ...
  }
 ], 
 "sessionId": "raspi7-porcupine_raspberry-pi-98099623-5e84-4cc2-b7bc-d77567f09c14", 
 ... more stuff, deleted for brevity ...
 "rawInput": "turn off left hallway light", 
 ... more stuff, deleted for brevity ...
}

This contains all the information openHAB needs: which item are we talking about? (HWL_Light_Proxy) What is the desired state? (OFF) Which satellite picked up the voice command? (raspi7) What did I say? (“turn off left hallway light”)

Updating this string item triggers a rule in rules/rhasspy.rules, which looks like this

rule "Rhasspy SetOneLight message"
when 
    Item Rhasspy_SetOneLight received update 
then    
    val String json = newState.toString
    val String rawInput = transform("JSONPATH","$.rawInput", json)
    val String itemName = transform("JSONPATH","$.slots[?(@.entity=='oh_items')].value.value", json)
    val String itemState = transform("JSONPATH","$.slots[?(@.entity=='state')].value.value", json).toUpperCase
    val String siteId = transform("JSONPATH","$.siteId", json)
    logInfo("voice","Site {} heard '{}'", siteId, rawInput )

    // set the item as requested
    val theItem = gVA.members.findFirst[ t | t.name==itemName] as SwitchItem
    if (theItem !== null) {
        theItem.sendCommand(itemState)
    } else {
        logInfo("voice","ERROR: no item named '{}'", itemName)    
    }

    // where should the command acknowledgement be heard?
    val sinkName = transform("MAP","source_to_sink.map",siteId)
    val sinkItem = gSay.members.findFirst[ t | t.name=="say_"+sinkName] as StringItem
    if (sinkItem !== null) {
        sinkItem.sendCommand(rawInput)
    }
    logInfo("voice","ACTION: set '{}' to '{}', notify '{}'", itemName, itemState, sinkName)

    val roomName = transform("MAP","stt_to_room.map",siteId)
    val ttsName = transform("MAP","room_to_tts.map",roomName)
    logInfo("voice","stt {} in {}, reply to {}", siteId, roomName, ttsName)
end 
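
To test the rule without speaking, you can publish a stripped-down intent payload to the broker yourself; a sketch in Python, assuming the paho-mqtt package and my broker host ha-server:

#!/usr/bin/python3
# Publish a minimal fake SetOneLight intent so the openHAB rule fires
# (sketch; assumes paho-mqtt is installed and the MQTT broker runs on ha-server).
import json
import paho.mqtt.publish as publish

payload = {
    "rawInput": "turn off left hallway light",
    "siteId": "raspi7",
    "slots": [
        {"entity": "state", "slotName": "state", "rawValue": "off",
         "value": {"kind": "Unknown", "value": "OFF"}},
        {"entity": "oh_items", "slotName": "lightName",
         "rawValue": "left hallway light",
         "value": {"kind": "Unknown", "value": "HWL_Light_Proxy"}},
    ],
}
publish.single("hermes/intent/SetOneLight", json.dumps(payload), hostname="ha-server")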

Since I want the acknowledgement spoken in the same room where the command was picked up, I need a map transformation from STT device to room, and from room to TTS device. The Raspberry Pis are capable of both voice input and output, so for them the STT device and the TTS device are the same. The Willow box can only do speech recognition, so it needs a separate box for voice output in the same room; in that case the STT and TTS devices are separate.
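
Those two map files follow the same key=value format as source_to_sink.map shown further below; a sketch with made-up room names:

# transform/stt_to_room.map (room names are examples)
raspi7=livingroom
boxlite-A=kitchen

# transform/room_to_tts.map
livingroom=raspi7
kitchen=espD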

Turning items on or off

The system will recognize voice commands like “turn the left hallway light on, please” or “switch off main bathroom fan” … provided you have defined openHAB Items with the label texts “left hallway light” and “main bathroom fan”, and assigned them to group gVA.
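
The hallway light item was shown above; the fan would need a similar definition (the item name here is made up):

Switch BathFan_Proxy      "main bathroom fan"      <fan>    (gVA)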

Aye, aye, sir — how to acknowledge a voice command

I wanted the voice interaction system to acknowledge that a voice command has been received by repeating the command after the action has been performed. To do this, we just use the voice announcement functionality described above to repeat the text that the Rhasspy speech recognition reported.
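
The rules rely on one String item per output device, named say_<siteId> and collected in group gSay; a skeleton, with the channel bindings omitted because they depend on your TTS setup:

Group  gSay  "TTS output items"
String say_raspi7  "Say on raspi7"  (gSay)   // channel binding omitted
String say_espD    "Say on espD"    (gSay)   // depends on your TTS setup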

One little complication: some satellites are input-only, and others are output-only, so the acknowledgement may need to be spoken by a different satellite from the one that heard the voice command. To deal with that, we have a MAP file that maps input siteIds to corresponding output siteIds.

In /etc/openhab/transform/source_to_sink.map , we have

# map voice recognition sites to notification sites
boxlite-A=espD
boxlite-B=raspi7
raspi7=raspi7
raspi11=raspi11
raspi14=raspi14
-=undefined
NULL=NULL

In /etc/openhab/rules/rhasspy.rules, this map is used in the acknowledgement section of the rule shown above, repeated here for reference

rule "Rhasspy SetOneLight message"
when 
    Item Rhasspy_SetOneLight received update 
then 
    ...see above ...

    // where should the command acknowledgement be heard?
    val sinkName = transform("MAP","source_to_sink.map",siteId)
    val sinkItem = gSay.members.findFirst[ t | t.name=="say_"+sinkName] as StringItem
    if (sinkItem !== null) {
        sinkItem.sendCommand(rawInput)
    }
end 
