The Import and Search capsule template lets you build capsules whose users can query underlying data tables through natural language. The queries can be relational, linking a main data table to supplementary data tables: a movie information capsule, for instance, could have a table of movies and a table of actors, letting you train it on such utterances as:
Tables are imported into the Import and Search template builder in CSV (comma-separated values) format. The template will create concepts named after the table column headers and infer types based on the values. That's not all, though: the Import and Search template also infers relationships between tables based on column names and data.
In the example for the movie information capsule, a movies table might have the following structure:
|id||name||posterImage||actors||description|
|429617||Spider-Man: Far from Home||429617.jpg||"1136406,131,505710"||"Peter Parker and his friends ..."|
|420818||The Lion King||420818.jpg||"119589,14386,15152"||"Simba idolises his father, King Mufasa ..."|
|559969||El Camino: A Breaking Bad Movie||559969.jpg||"84497,88124,82945"||"In the wake of his dramatic escape ..."|
An actors table might look like this:
The Import and Search template can infer that the actors column in the movies table refers to the actors table. Since movies have multiple actors, this column contains array values linking to the actors by ID.
The actual "Movie Agent" sample data is generated from The Movie DB, and can be regenerated by a Python script included in the sample repository. The URLs for poster and profile images in the actual sample data are complete, working URLs; they're abbreviated here in the documentation for readability.
There are conventions you must follow when creating CSV files for the Import and Search template, as well as some special syntax you can use to tweak how Bixby Developer Studio maps the table content to concepts.
The column named name, or the first column in the table if no name column is found, is treated as the row's description: for example, the name column in the movies table above.
The movies table above looks like this in CSV form:
    id,name,posterImage,*actors*,-description
    429617,Spider-Man: Far from Home,https://image.tmdb.org/429617.jpg,"1136406,131,505710","Peter Parker and his friends ..."
    420818,The Lion King,420818.jpg,"119589,14386,15152","Simba idolises his father, King Mufasa ..."
    559969,El Camino: A Breaking Bad Movie,559969.jpg,"84497,88124,82945","In the wake of his dramatic escape ..."
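As a rough illustration of how this CSV form carries array values, the sketch below reads two of the rows above with Python's standard csv module and splits the quoted actors cell into individual IDs. It only mimics the file format; it is not the template builder's own parser, and the helper names read_rows and actor_ids are made up for this example.

```python
import csv
import io

# CSV text copied from the movies example above (first two data rows only).
MOVIES_CSV = '''id,name,posterImage,*actors*,-description
429617,Spider-Man: Far from Home,https://image.tmdb.org/429617.jpg,"1136406,131,505710","Peter Parker and his friends ..."
420818,The Lion King,420818.jpg,"119589,14386,15152","Simba idolises his father, King Mufasa ..."
'''


def read_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by the raw header names."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def actor_ids(row):
    """Split the quoted array cell into individual actor IDs (hypothetical helper)."""
    return [part.strip() for part in row["*actors*"].split(",")]


rows = read_rows(MOVIES_CSV)
for row in rows:
    print(row["name"], actor_ids(row))
```

Because the array cell is quoted, the CSV reader returns it as a single string, and the split into IDs happens as a separate step.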
First, here are some general guidelines:
actorName instead of just
A column whose name ends in Id (or is just named id) will be assumed to be a unique identifier for the row. It's recommended that all tables that describe single concepts, like actors, have ID columns; tables that describe relationships, like movies, should have columns that refer to ID columns in other tables.
To show how column names and data are used to derive column types, take the following example of a Hotel concept:
    HotelName,Website,*Amenities*,Stars,Reviews,City,Price,CheckinTime
    The Peninsula,https://hotelpeninsula.com/,"pool,jacuzzi",4.0,164,Paris,$400,11:00am
    Regent Berlin,https://regentberlin.com/,"wifi,pets",5.0,463,Berlin,$300,11:30am
    Brown's Hotel,https://brownshotel.com/,wifi,4.5,268,London,$350,11:00am
Name (or is just named name) will be given a name primitive type, like
Description (or is just named description) will be given a
price with numeric values will be mapped to
Reviews above, will be mapped to
Stars above, will be mapped to
State, will be mapped to
gif will be mapped to
Website above, will be mapped to
AwayTeam columns; the generated AwayTeam concepts would have roles of
By default, properties in generated models are given a cardinality of min (Optional) and max (One).
You can add a prefix character to column names to control input cardinality:
!: min (Required) max (One)
*: min (Optional) max (Many)
+: min (Required) max (Many)
-: do not add this column as an input
i:: add this column as an input, but exclude it from the generated concept
You can add a suffix character to column names to control output cardinality:
!: min (Required) max (One)
*: min (Optional) max (Many)
+: min (Required) max (Many)
-: do not add this column as an output property
In the hotel example, the Amenities column has a min (Optional) max (Many) input cardinality, and is prefixed with *. It also has array values in that column, which means it requires an output cardinality of max (Many), and thus is suffixed with * as well.
You could make HotelName a required output by appending a ! to its column name: HotelName!
And you could make all the properties required by prepending ! to the single-value ones and changing the * prefix to +.
In practice, only inputs that are truly necessary for every search query should be set to a min (Required) input cardinality.
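To make the prefix and suffix convention concrete, here is a minimal Python sketch that splits a column heading into its bare name, input cardinality, and output cardinality. The function parse_header and the MARKERS table are hypothetical names for this illustration, and the sketch assumes that + means min (Required) max (Many); it ignores the i: prefix and any bracketed tags, and it makes no claim about how the template builder itself parses headings.

```python
# Marker characters and the (min, max) cardinality each is assumed to stand for.
MARKERS = {
    "!": ("Required", "One"),
    "*": ("Optional", "Many"),
    "+": ("Required", "Many"),  # assumption: required, multiple values
}
DEFAULT = ("Optional", "One")  # default cardinality when no marker is present


def parse_header(header):
    """Return (name, input_cardinality, output_cardinality).

    A cardinality of None means the column is excluded as an input or output.
    """
    input_card, output_card = DEFAULT, DEFAULT
    # A leading marker controls the input side.
    if header[:1] == "-":
        input_card, header = None, header[1:]
    elif header[:1] in MARKERS:
        input_card, header = MARKERS[header[0]], header[1:]
    # A trailing marker controls the output side.
    if header[-1:] == "-":
        output_card, header = None, header[:-1]
    elif header[-1:] in MARKERS:
        output_card, header = MARKERS[header[-1]], header[:-1]
    return header, input_card, output_card


for heading in ["*Amenities*", "HotelName!", "-description", "Stars"]:
    print(heading, "->", parse_header(heading))
```

Keeping the markers in a single lookup table means the same characters are interpreted identically on both sides of the name, which matches how the Amenities column uses * as both prefix and suffix.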
A cell can contain comma-separated array values:
Since they contain commas, array values must be quoted in the actual CSV file. See the example in Cardinality above.
Columns with array values must be suffixed with * or + to set their cardinality to max (Many). Otherwise they will default to max (One), and the whole quoted text will be interpreted as one value, such as the actual text "pool,jacuzzi", rather than as multiple values!
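If you generate your CSV files from a script, a standard CSV writer will usually apply the required quoting for you. The following Python sketch, which uses only the standard csv module, writes one hotel row whose amenities are joined with commas and then reads it back; the helper name to_csv_line is invented for this example.

```python
import csv
import io


def to_csv_line(cells):
    """Serialize one row; csv.writer quotes any cell that contains a comma."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(cells)
    return buffer.getvalue().rstrip("\n")


amenities = ["pool", "jacuzzi"]
row = ["The Peninsula", ",".join(amenities), "4.0"]

line = to_csv_line(row)
print(line)  # The Peninsula,"pool,jacuzzi",4.0

# Reading the quoted line back keeps the array in a single cell.
cells = next(csv.reader([line]))

# Joining by hand without quotes would spill the array into extra columns.
naive_cells = next(csv.reader([",".join(row)]))
print(len(cells), len(naive_cells))  # 3 4
```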
Depending on your data sets, your capsule might need to handle queries that allow filtering based on values, similar to an SQL WHERE clause:
The Import and Search capsule template can generate models that handle these cases for you by using filters.
To define a filter, append a [filter] tag to a column heading. Use the form [filter=filter1:filter2:...] to specify the filtering operations that are allowed on the column. The available operations are:
min: allow specifying a minimum value.
max: allow specifying a maximum value.
eq: allow specifying an exact value.
You can use filtering operations on columns with text and numeric values. You cannot specify filters on date or SearchRegion columns; the Query capsule will set up default filtering operations for those types automatically. (Date and location columns will still be searchable using all the flexibility built into those types.)
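The effect of the min, max, and eq operations can be pictured as a simple row filter, much like a WHERE clause. The Python sketch below applies such bounds to the hotel rows from the earlier example. It is only an analogy for what the generated models do: the function names are hypothetical, and treating the bounds as inclusive is an assumption.

```python
# A few columns from the hotel example, already converted to numbers.
HOTELS = [
    {"HotelName": "The Peninsula", "Stars": 4.0, "Reviews": 164},
    {"HotelName": "Regent Berlin", "Stars": 5.0, "Reviews": 463},
    {"HotelName": "Brown's Hotel", "Stars": 4.5, "Reviews": 268},
]


def matches(value, minimum=None, maximum=None, equals=None):
    """Check one value against optional min, max, and eq bounds (inclusive)."""
    if equals is not None and value != equals:
        return False
    if minimum is not None and value < minimum:
        return False
    if maximum is not None and value > maximum:
        return False
    return True


def search(rows, column, **bounds):
    """Return the names of the hotels whose column value passes the bounds."""
    return [row["HotelName"] for row in rows if matches(row[column], **bounds)]


print(search(HOTELS, "Stars", minimum=4.5))
print(search(HOTELS, "Reviews", maximum=300))
print(search(HOTELS, "Stars", equals=4.0))
```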
You can control whether the data from a column shows up on summary or detail cards by appending [display=...] to the column name.
summary: the column should only appear in the summary view
details: the column should only appear in the details view
none: the column should not appear on either summary or detail cards
summary:details: the column should appear on both kinds of cards
The default is to appear on both (summary:details).
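Bracketed tags such as [filter=...] and [display=...] share the same shape: a key, an optional = sign, and colon-separated values. As an illustration only, the Python sketch below pulls those tags out of a column heading with a regular expression. The headings used here are made-up examples built from the hotel columns, and split_heading is a hypothetical helper, not part of Bixby Developer Studio.

```python
import re

# Matches one bracketed tag: [key] or [key=value1:value2:...]
DIRECTIVE = re.compile(r"\[([^\]=]+)(?:=([^\]]*))?\]")


def split_heading(heading):
    """Return (base_name, directives) for a heading with bracketed tags."""
    base = heading.split("[", 1)[0]
    directives = {}
    for key, value in DIRECTIVE.findall(heading):
        # A bare tag such as [filter] carries no values.
        directives[key] = value.split(":") if value else []
    return base, directives


print(split_heading("Price[filter=min:max]"))
print(split_heading("Website[display=details]"))
print(split_heading("Stars[filter]"))
```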
Each column can have a label suffix to specify singular and plural forms for that column in the capsule's output language, of the form
This label is required for non-English languages. If you supply only one form (as in
If you need to specify multiple directives in brackets, they should appear one after another with no space between the brackets, such as:
Order is important:
When you start the Import and Search capsule template builder and choose a language for your capsule, you'll be presented with an Import and Search Template Screen.
You should load the main table first, such as the games table in the basketball schedule example, then load the tables that columns in the main table refer to.
Load each table's CSV file (such as games.csv) by clicking Upload CSV.
The table's label takes singular and plural forms, such as game:games. If you only specify one form, then in non-English languages, it will be assumed to be both singular and plural; in English, an "s" will be added to the form to make it plural (so game will be automatically pluralized to games).
After the capsule is generated, Bixby Studio opens the README file for your newly created capsule, with information about what models and basic training it's generated, along with some advice about next steps.
You can download sample CSV files for a variety of possible search capsules, including the NBA teams and schedules example, at the Bixby Developers GitHub repository.