OpenVoiceOS Architecture
This section can be a bit technical, but is included for reference. It is not necessary to read this section for day-to-day usage of OVOS.
OVOS is a collection of modular services that work together to provide a seamless, private, open source voice assistant.
The suggested way to start OVOS is with systemd service files. Most of the images run these services as a normal user instead of system-wide. If you get an error when using the user service files, try running it as a system service.
NOTE The ovos.service is just a wrapper to control the other OVOS services. It is used here as an example showing --user vs system.
- user service
systemctl --user status ovos.service
- system service
systemctl status ovos.service
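The difference between the two scopes comes down to a single flag. As a minimal sketch, the helper below builds either form of the command and checks whether a unit is active. The function names `systemctl_cmd` and `is_active` are illustrative and not part of OVOS. Only the standard `systemctl` options `--user` and `is-active` are assumed.

```python
import subprocess


def systemctl_cmd(action, unit, user=True):
    """Build the systemctl argument list for a user or system unit."""
    cmd = ["systemctl"]
    if user:
        cmd.append("--user")  # address the per-user service manager
    cmd += [action, unit]
    return cmd


def is_active(unit, user=True):
    """Return True if `systemctl is-active` exits successfully for the unit."""
    try:
        result = subprocess.run(
            systemctl_cmd("is-active", unit, user=user),
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        return False  # systemctl is not installed on this machine
    return result.returncode == 0
```

Calling `is_active("ovos.service", user=True)` and then `is_active("ovos.service", user=False)` is one way to find out which scope a given image uses.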
ovos-core
This service provides the main instance for OVOS and handles all skill loading and intent processing.
All user queries are handled by the skills service. You can think of it as OVOS's brain.
typical systemd command
systemctl --user status ovos-skills
systemctl --user restart ovos-skills
Messagebus
C++ version
NOTE This is an alpha version and mostly a proof of concept. It has been known to crash often.
You can think of the bus service as OVOS's nervous system.
The ovos-bus is considered an internal and private websocket; external clients should not connect directly to it. Please do not expose the messagebus to the outside world!
typical systemd command
systemctl --user start ovos-messagebus
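One way to check that the bus is not exposed is to confirm that it is bound to a loopback address only. The sketch below uses a made-up `websocket_config` dictionary. Its keys and values (`host`, `port`, `route`) are assumptions for illustration, so check your own mycroft.conf for the real settings.

```python
import ipaddress


def is_loopback_only(host):
    """Return True if the host is a loopback address (not reachable from other machines)."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # hostnames we cannot classify are treated as unsafe


# Hypothetical bus settings; the key names are assumptions, not confirmed here.
websocket_config = {"host": "127.0.0.1", "port": 8181, "route": "/core"}

if not is_loopback_only(websocket_config["host"]):
    print("Warning: the messagebus may be reachable from other machines")
```

A host of `0.0.0.0` would fail this check, because it makes the socket listen on every network interface.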
Listener
The listener service is used to detect your voice. It controls the WakeWord, STT (Speech To Text), and VAD (Voice Activity Detection) Plugins. You can modify microphone settings and enable additional features, such as wake word / utterance recording / uploading, under the listener section of your mycroft.conf file.
The ovos-dinkum-listener is the new OVOS listener that replaced the original ovos-listener and has many more options. Others still work, but are not recommended.
typical systemd command
systemctl --user start ovos-dinkum-listener
STT Plugins
This is where speech is transcribed into text and forwarded to the skills service.
Two STT plugins may be loaded at once. If the primary plugin fails, the second will be used.
Having a lower accuracy offline model as fallback will account for internet outages, which ensures your device never becomes fully unusable.
Several different STT plugins are available for use. OVOS provides a number of public services using the ovos-stt-plugin-server plugin, which are hosted by trusted OVOS members (see Members hosting services). No additional configuration is required.
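The primary/fallback behaviour described above can be pictured with a small sketch. It shows the general pattern only and is not the OVOS implementation. The functions `online_stt` and `offline_stt` are hypothetical stand-ins for two loaded STT plugins.

```python
def transcribe_with_fallback(audio, primary, fallback):
    """Try the primary STT callable; if it raises, use the fallback instead."""
    try:
        return primary(audio)
    except Exception:
        # e.g. a network outage while reaching an online STT server
        return fallback(audio)


def online_stt(audio):
    """Hypothetical online engine that is currently unreachable."""
    raise ConnectionError("STT server unreachable")


def offline_stt(audio):
    """Hypothetical lower-accuracy offline engine."""
    return "what time is it"


text = transcribe_with_fallback(b"raw-audio-bytes", online_stt, offline_stt)
```

With the online engine failing, `text` comes from the offline engine, so the device can still respond during an internet outage.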
Hotwords
OVOS uses "Hotwords" to trigger any number of actions. You can load any number of hotwords in parallel and trigger different actions when they are detected. Each Hotword can do one or more of the following:
- trigger listening, also called a Wake word
- play a sound
- emit a bus event
- take ovos-core out of sleep mode, also called a wakeup_word or standup_word
- take ovos-core out of recording mode, also called a stop_word
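To make the list above concrete, here is a sketch of how several hotwords with different actions might be described. The key names (`listen`, `sound`, `bus_event`, `wakeup`, `stopword`), the hotword names, the sound path, and the event name are all assumptions chosen to mirror the list. Check the hotwords section of your own mycroft.conf for the exact names.

```python
import json

# Hypothetical hotword definitions; every key and value here is illustrative.
hotwords = {
    "hey_mycroft": {"listen": True, "sound": "snd/start_listening.wav"},
    "wake_up": {"wakeup": True},
    "stop_recording": {"stopword": True, "bus_event": "example.hotword.detected"},
}

# Assumed action keys, in the order used by the list above.
ACTION_KEYS = ("listen", "sound", "bus_event", "wakeup", "stopword")


def describe_actions(hotword_cfg):
    """Return the names of the actions enabled for one hotword."""
    return [key for key in ACTION_KEYS if hotword_cfg.get(key)]


for name, cfg in hotwords.items():
    print(name, "->", describe_actions(cfg))

# The same structure rendered as JSON, the format used by mycroft.conf.
as_json = json.dumps({"hotwords": hotwords}, indent=2)
```

Each hotword is independent, so one can trigger listening while another only wakes the device from sleep mode.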
WakeWord Plugins
A Wake word is what OVOS uses to activate the device. By default, Hey Mycroft is used by OVOS. Like other things in the OVOS ecosystem, this is configurable.
VAD Plugins
VAD Plugins detect when you are actually speaking to the device, and when you stop talking.
Most of the time, this will not need to be changed. If your microphone has trouble hearing you, or the device does not stop listening when you are done talking, try a different VAD plugin and see if it helps.
Audio
The audio service handles the output of all audio. It is how you hear the voice responses, music, or any other sound from your OVOS device.
TTS Plugins
TTS (Text To Speech) is the verbal response from OVOS. There are several plugins available that support different engines. Multiple languages and voices are available to use.
OVOS provides a set of public TTS servers hosted by trusted OVOS members (see Members hosting services). It uses the ovos-tts-server-plugin, and no additional configuration is needed.
PHAL
PHAL stands for Plugin-based Hardware Abstraction Layer. It allows different hardware devices to access the OVOS software stack. It completely replaces the concept of the hardcoded "enclosure" from mycroft-core.
Any number of plugins providing functionality can be loaded and validated at runtime. Plugins can be system integrations that handle things like reboot and shutdown, or hardware drivers such as the Mycroft Mark 1 plugin.
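The idea of validating plugins at runtime can be sketched as follows. This is a generic illustration and not the PHAL code. The `HypotheticalPlugin` class, its `validator` callable, and the plugin names are invented for the example.

```python
class HypotheticalPlugin:
    """Illustrative plugin record: a name plus a check that the hardware is present."""

    def __init__(self, name, validator):
        self.name = name
        self.validator = validator


def load_valid_plugins(candidates):
    """Return the names of plugins whose validator passes on this device."""
    loaded = []
    for plugin in candidates:
        try:
            if plugin.validator():
                loaded.append(plugin.name)
        except Exception:
            continue  # a failing validator must not stop the other plugins
    return loaded


def missing_hardware():
    """Simulate a driver probe that fails because the device is absent."""
    raise OSError("device not found")


candidates = [
    HypotheticalPlugin("system-integration", lambda: True),
    HypotheticalPlugin("mark1-driver", lambda: False),
    HypotheticalPlugin("broken-driver", missing_hardware),
]
loaded = load_valid_plugins(candidates)
```

Only the plugin whose check passes ends up loaded, which is how a single image could run on different hardware without a hardcoded enclosure.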
Admin PHAL
Similar to regular PHAL, but used when sudo or a privileged user is needed.
Be extremely careful when adding admin-phal plugins. They give OVOS administrative (root) privileges on your operating system.
GUI
OVOS uses the standard mycroft-gui framework, which is covered by the official mycroft-gui documentation.
The GUI service provides a websocket for GUI clients to connect to. It is responsible for implementing the GUI protocol under ovos-core.
In-depth documentation for the GUI service is available separately.
Other OVOS services
OVOS provides a number of helper scripts to allow the user to control the device at the command line.
ovos-say-to
This provides a way to communicate an intent to OVOS.
ovos-say-to "what time is it"
ovos-listen
This opens the microphone for listening, just as if you had said the wake word. It expects a verbal command.
- Continue by speaking to your device, for example "what time is it"
ovos-speak
This takes the text you provide, runs it through the TTS (Text To Speech) engine, and speaks it.
ovos-speak "hello world"
will output "hello world" in the configured TTS voice.
ovos-config
ovos-config is a command line interface that allows you to view and set configuration values.
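Conceptually, the first three helper scripts above amount to placing a message on the messagebus. The sketch below builds messages in the Mycroft-style `type` / `data` / `context` layout. The specific message types shown (`recognizer_loop:utterance`, `mycroft.mic.listen`, `speak`) are assumptions about what such scripts emit, and the real scripts may differ.

```python
import json


def make_message(msg_type, data=None, context=None):
    """Serialize a bus message in the type/data/context layout."""
    return json.dumps(
        {"type": msg_type, "data": data or {}, "context": context or {}}
    )


# Assumed equivalent of: ovos-say-to "what time is it"
say_to = make_message("recognizer_loop:utterance", {"utterances": ["what time is it"]})

# Assumed equivalent of: ovos-listen
listen = make_message("mycroft.mic.listen")

# Assumed equivalent of: ovos-speak "hello world"
speak = make_message("speak", {"utterance": "hello world"})
```

Because the bus is internal and private, messages like these should only ever be sent from the device itself.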
Newest (> Nov 2023) raspOVOS images only
- docs-community View the OVOS Community docs in the terminal
- docs-techincal View the OVOS Technical docs in the terminal
- docs-hivemind View the HiveMind docs in the terminal
- docs-messages View OVOS Message specs in the terminal