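Training in Bixby can be extended and enhanced by adding vocabulary. Vocabulary gives Bixby hints about which words and phrases are values of a concept, maps natural language terms to the symbols of enumerated concepts, and can link terms to sort keys.
Note that structures use patterns instead of vocabulary.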
Imagine a user who is looking to watch a movie tonight. Instead of simply saying, "What movies are playing tonight?", the user might ask, "What comedies are playing tonight?" Your capsule can handle this case by supplying vocabulary.
To do this, create a new file of type Vocabulary within your capsule (File > New File) in the language-specific resource folder the vocabulary belongs to (in this case, en). This will create a new *.vocab.bxb file under that folder.
If you are planning to localize your capsule, you must create a separate vocab file for that concept within each language-specific resource folder your capsule supports.
You can then add a word/phrase list for the corresponding concept:
vocab (MovieGenre) {
  "comedies"
  "cartoons"
  "animation"
  "mysteries"
  "romances"
  "documentaries"
}
This list doesn't specify the only possible values for the MovieGenre concept. It provides hints that help Bixby's natural language understanding when the user's utterance lacks clear context.
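As a sketch of the localization requirement mentioned earlier: if the capsule also supported Spanish, the same concept would need its own vocabulary file in that language's resource folder. The folder name es, the file path, and the translated terms below are illustrative assumptions, not taken from an actual capsule.

```
// Hypothetical file: resources/es/MovieGenre.vocab.bxb
// Each supported language folder carries its own list for the concept.
vocab (MovieGenre) {
  "comedias"
  "dibujos animados"
  "animación"
  "misterios"
  "romances"
  "documentales"
}
```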
If you extend a type into another capsule, a new vocabulary file must still be created. Vocabulary is never inherited, even if you use extends or add role-of to a model. You can, however, add new vocabulary to an imported enum. See Imported Vocabulary.
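For example, suppose a capsule defines a concept named FavoriteGenre that uses role-of on MovieGenre. FavoriteGenre is a hypothetical name used only for illustration. Because vocabulary is not inherited, the new concept would need its own vocabulary file, roughly like this:

```
// Hypothetical concept FavoriteGenre (role-of MovieGenre).
// The MovieGenre terms do not carry over, so they are listed again.
vocab (FavoriteGenre) {
  "comedies"
  "cartoons"
  "animation"
  "mysteries"
  "romances"
  "documentaries"
}
```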
Vocabulary also allows you to add natural language words or phrases for symbols in enumerated primitive concepts (with some exceptions mentioned).
Suppose a weather capsule has symbols for weather that match a weather condition API:
enum (WeatherCondition) {
  symbol (wcClear)
  symbol (wcRain)
  symbol (wcSnow)
  symbol (wcHail)
  symbol (wcSleet)
  symbol (wcFog)
}
We might want to allow users to ask Bixby questions that directly refer to enumerated values:
"Is it going to rain tonight?"
Or:
"Is it going to be foggy tomorrow morning?"
To handle this, you must include a vocabulary file that specifies the symbols and their matching vocabulary terms:
vocab (WeatherCondition) {
  "wcClear" { "clear" "sunny" }
  "wcRain" { "rain" "raining" }
  "wcSnow" { "snow" "snowing" }
  "wcHail" { "hail" "hailing" }
  "wcSleet" { "sleet" }
  "wcFog" { "fog" "foggy" }
}
If you don’t have corresponding vocabulary for enum concepts and tag them in a training example, that training example will not be learned.
With this vocabulary in place, utterances related to the weather conditions can be trained, and Bixby can understand related terms such as "fog" and "foggy".
Note that vocabulary terms have to be matched exactly for Bixby to understand them. The vocabulary above will match "snow" and "snowing" but not "snowy". Because of this limitation, it's best to avoid training on enum symbols in your capsule where possible.
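If "snowy" should also be understood, it has to be listed explicitly. A minimal sketch of the same vocabulary with that one term added (the other entries are unchanged):

```
vocab (WeatherCondition) {
  "wcClear" { "clear" "sunny" }
  "wcRain" { "rain" "raining" }
  // "snowy" is added because terms are matched exactly, not by stem
  "wcSnow" { "snow" "snowing" "snowy" }
  "wcHail" { "hail" "hailing" }
  "wcSleet" { "sleet" }
  "wcFog" { "fog" "foggy" }
}
```

Every variant a user might say needs its own entry, which is why lists like this tend to grow quickly.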
You can create vocabulary terms in this fashion for any closed type: enum, boolean, integer, and decimal.
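As a sketch of the same pattern applied to a boolean concept: PetFriendly below is a hypothetical concept used only for illustration, and the entry form is assumed to mirror the enum entries shown above. The symbols are written in lowercase.

```
// Hypothetical boolean concept PetFriendly.
// Each symbol maps to the phrases a user might say for it.
vocab (PetFriendly) {
  "true" { "pet friendly" "allows pets" }
  "false" { "no pets" "not pet friendly" }
}
```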
If your capsule allows for changing the sort order of results, you can link vocabulary entries to sort keys. For example, if you are working on a hotel booking capsule, you could sort search results by price.
This example lists various ways a user could refer to expensive hotels (example.hotel.HighRate):
vocab (example.hotel.HighRate) {
  $sort-desc {
    "expensive"
    "extravagant"
    "fancier"
    "fancy"
    "high priced"
    "higher priced"
    "lavish"
    "luxurious"
    "luxury"
    "more costly"
    "more expensive"
    "most costly"
    "most expensive"
    "opulent"
    "pricey"
    ...
  }
}
When your capsule imports another capsule, it imports that capsule's vocabulary. For example, the viv.self library has a Field concept, which is simply an enum of field names:
enum (Field) {
  symbol (firstName)
  symbol (lastName)
  symbol (structuredName)
  symbol (firstAndLastName)
  symbol (phoneInfo)
  symbol (emailInfo)
  symbol (addressInfo)
}
This is used internally by viv.self to allow users to refer to individual fields in the Profile model by using vocabulary associated with the symbols:
vocab (Field) {
  "FirstName" { "first name" "given name" }
  "LastName" { "last name" "family name" }
  "EmailInfo" { "email" "email address" }
  "PhoneInfo" { "phone number" "number" "phone" }
  "AddressInfo" { "address" }
}
This list doesn't define vocabulary for StructuredName or FirstAndLastName; this is to allow capsules that import viv.self to add their own vocabulary to cover those cases, as those might be domain-specific. The Space Resort capsule adds "name" as vocabulary for FirstAndLastName:
vocab (self.Field) {
  "FirstAndLastName" { "name" }
}
While vocabulary is imported, it is not inherited. If your capsule uses extends or role-of to build on an imported concept, any vocabulary for the base concept will not apply to your new concept.
Vocabulary is considered closed if all the vocabulary belongs to a closed set of specified terms, and open otherwise.
For open matching, if you tag a word or phrase in a training example as a specific concept and it is not explicitly listed in the vocab, the match is considered "out-of-vocabulary" (OOV).
The following types are considered "closed": enum, boolean (whose only symbols are true and false), integer, and decimal.
Any other type that takes vocabulary is considered "open".
This section discusses some of the limitations and restrictions for vocabulary.
A capsule can have a maximum of 75,000 vocabulary entries total (across all concepts). Each concept can have up to 50,000 vocabulary entries.
Bixby does not support pattern recognition in vocabulary. For example, let's assume you have a vocabulary list of MovieTitles and a vocabulary list of ActorNames. If your user has an utterance like "Show me Tom Cruise movies", unless you have "Tom Cruise" explicitly in your ActorNames vocab, Bixby won't be able to tell whether "Tom Cruise" is an actor or a movie title. The first thing to try is adding more and better training examples. If that doesn't work, then it is probably not reasonable to have Bixby separate all those types for you. Instead, you should have a single type like SearchQuery and have your backend server handle the sorting and structured querying itself.
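A minimal sketch of such a catch-all type, assuming SearchQuery is modeled as a text primitive. The description wording is illustrative only, and how the backend interprets the value is outside the scope of this sketch.

```
// Hypothetical model file for the single catch-all type.
// The whole phrase ("Tom Cruise") is captured as one value, and the
// backend decides whether it names an actor or a movie title.
text (SearchQuery) {
  description (Free-form search text passed to the backend unchanged)
}
```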
The appropriate casing for boolean symbols in vocab is true and false.
Do use all lower case.
Don't use True, tRuE, or any other form of mixed case, and don't use all upper case such as TRUE.
Automatic Speech Recognition (ASR) and Text-to-Speech (TTS) are useful Bixby capabilities for communicating with users. However, the natural language utterances you train and the vocabulary you add are only used in their specific capsule (although they might be used over time to help improve ASR and TTS performance).