Bixby Developer Center

Import and Search Capsule

Overview

The Import and Search capsule template lets you build capsules in which users query underlying data tables through natural language. Queries can be relational, linking a main data table to supplementary data tables. A basketball schedule capsule, for instance, could have a table of teams and a table of schedule information, letting you train it on utterances such as:

  • "When's the next Orlando Magic home game?"
  • "What stadium do the Charlotte Hornets play in?"
  • "Show me the Timberwolves schedule."

Tables are imported into the Import and Search template builder in CSV (comma-separated values) format. The template will create concepts named after the table column headers and infer types based on the values. That's not all, though: the Import and Search template also infers relationships between tables based on column names and data.

In the example for the basketball schedule capsule, a games table might have the following structure:

GameName  | date     | time  | HomeTeam | AwayTeam
NOP @ TOR | 10/22/19 | 17:00 | TOR      | NOP
LAL @ LAC | 10/22/19 | 19:30 | LAC      | LAL
CHI @ CHA | 10/23/19 | 16:00 | CHA      | CHI

And a teams table might contain more information about each team:

TeamId | TeamCityName | TeamName | City          | Arena
ATL    | Atlanta      | Hawks    | Atlanta       | State Farm Arena
BOS    | Boston       | Celtics  | Boston        | TD Garden
BRK    | Brooklyn     | Nets     | New York City | Barclays Center

The Import and Search template can infer that the HomeTeam and AwayTeam columns in the games table are references to the TeamId column in the teams table.
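The idea behind this kind of inference can be illustrated with a short Python sketch. This is not the template's actual algorithm, which is not documented here; it only shows one plausible heuristic, where a column in the main table is treated as a reference when every one of its values appears in an ID column of another table. The function names and the trimmed sample tables are assumptions made for illustration.

```python
import csv
import io

# Trimmed, illustrative versions of the games and teams tables.
GAMES_CSV = """GameName,date,time,HomeTeam,AwayTeam
NOP @ TOR,10/22/19,17:00,TOR,NOP
LAL @ LAC,10/22/19,19:30,LAC,LAL
"""

TEAMS_CSV = """TeamId,TeamCityName,TeamName
TOR,Toronto,Raptors
NOP,New Orleans,Pelicans
LAL,Los Angeles,Lakers
LAC,Los Angeles,Clippers
"""


def read_table(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def infer_references(main_rows, ref_rows):
    """Map each main-table column to a reference-table ID column
    when all of its values occur in that ID column (assumed heuristic)."""
    id_columns = [c for c in ref_rows[0] if c.endswith("Id") or c == "id"]
    references = {}
    for id_column in id_columns:
        known_ids = {row[id_column] for row in ref_rows}
        for column in main_rows[0]:
            if all(row[column] in known_ids for row in main_rows):
                references[column] = id_column
    return references


references = infer_references(read_table(GAMES_CSV), read_table(TEAMS_CSV))
print(references)
```

With these sample tables, only HomeTeam and AwayTeam qualify, because GameName, date, and time contain values that are not team IDs.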

Table Format

There are conventions you must follow when creating CSV files for the Import and Search template, as well as some special syntax you can use to tweak how Bixby Developer Studio maps the table content to concepts.

General

  • The CSV files must include a header row.
  • The CSV files must have values separated with commas, not tabs or any other separator value.
  • All values must be included: no empty cells.
  • The first column with the name primitive type, or the first column in the table if no name column is found, is treated as the row's description: for example, the GameName column in the games table above, and the TeamCityName column in the teams table.

The teams table above would look like this in CSV form:

TeamId,TeamCityName,TeamName,City,Arena
ATL,Atlanta,Hawks,Atlanta,State Farm Arena
BOS,Boston,Celtics,Boston,TD Garden
BRK,Brooklyn,Nets,New York City,Barclays Center
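Before uploading, it can help to check a file against the general conventions listed above. The following Python sketch is a hypothetical pre-flight check, not part of Bixby Developer Studio; `validate_csv` is a name invented here. It flags empty cells, rows whose length differs from the header, and a header that appears to be tab-separated. It cannot tell whether the first row really is a header, so that still needs a visual check.

```python
import csv
import io


def validate_csv(text):
    """Return a list of problems found; an empty list means the checks passed."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty: a header row is required"]

    problems = []
    header = rows[0]
    # A single header cell containing tabs suggests the wrong separator.
    if len(header) == 1 and "\t" in header[0]:
        problems.append("header looks tab-separated; use commas")

    for line_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"line {line_no}: expected {len(header)} values, found {len(row)}"
            )
        for name, value in zip(header, row):
            if value.strip() == "":
                problems.append(f"line {line_no}: empty cell in column {name}")
    return problems


TEAMS_CSV = """TeamId,TeamCityName,TeamName,City,Arena
ATL,Atlanta,Hawks,Atlanta,State Farm Arena
BOS,Boston,Celtics,Boston,TD Garden
BRK,Brooklyn,Nets,New York City,Barclays Center
"""

print(validate_csv(TEAMS_CSV))                     # no problems for the teams table
print(validate_csv("TeamId,TeamName\nATL,\n"))     # reports the empty TeamName cell
```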

Mapping Columns to Concepts

  • It's best practice to give the columns names that describe their concepts: TeamName instead of just Name.
  • A column that ends in Id (or is just named id) will be assumed to be a unique identifier for the row. It's recommended that all tables that describe single concepts, like teams, have ID columns; tables that describe relationships, like games, should have columns that refer to ID columns in other tables.
  • A column that ends in Name (or is just named name) will be given a name primitive type.
  • A column that ends in Text or Description (or is just named text or description) will be given a text primitive type.
  • A column named price with numeric values will be mapped to viv.money.Currency.
  • A column ending with (or named) City, Country, or State will be mapped to viv.geo.SearchRegion.
  • A column with date values will be mapped to viv.time.Date.
  • A column with time values will be mapped to viv.time.DetachedTime.
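The naming rules above can be summarized as a small lookup function. The Python sketch below is only an illustration of those rules, not the template's implementation: the order in which the rules are checked, the date and time patterns, and the return labels "id" and "other" are all assumptions.

```python
import re

# Assumed value formats, based on the games table (10/22/19 and 17:00).
DATE_RE = re.compile(r"^\d{1,2}/\d{1,2}/\d{2,4}$")
TIME_RE = re.compile(r"^\d{1,2}:\d{2}$")


def has_suffix(column_name, suffix):
    """True if the column ends in the suffix or is named its lowercase form."""
    return column_name.endswith(suffix) or column_name == suffix.lower()


def is_number(value):
    try:
        float(value)
        return True
    except ValueError:
        return False


def infer_concept_type(column_name, sample_value):
    """Guess the mapped type for a column from its name and one sample value."""
    if has_suffix(column_name, "Id"):
        return "id"
    if has_suffix(column_name, "Name"):
        return "name"
    if has_suffix(column_name, "Text") or has_suffix(column_name, "Description"):
        return "text"
    if column_name.lower() == "price" and is_number(sample_value):
        return "viv.money.Currency"
    if any(has_suffix(column_name, s) for s in ("City", "Country", "State")):
        return "viv.geo.SearchRegion"
    if DATE_RE.match(sample_value):
        return "viv.time.Date"
    if TIME_RE.match(sample_value):
        return "viv.time.DetachedTime"
    return "other"  # left to value-based inference


for column, value in [("TeamId", "ATL"), ("TeamCityName", "Atlanta"),
                      ("City", "Atlanta"), ("date", "10/22/19"), ("time", "17:00")]:
    print(column, "->", infer_concept_type(column, value))
```

Note that TeamCityName resolves to a name type rather than a region because it ends in Name, which is why the teams table carries a separate City column.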

Cardinality

By default, properties in generated models are given a cardinality of min (Optional) and max (One).

You can add prefix characters to column names to control input cardinality:

  • !: min (Required) max (One)
  • *: min (Optional) max (Many)
  • +: min (Required) max (Many)
  • -: do not add this column as an input

You can add suffix characters to column names to control output cardinality:

  • !: min (Required) max (One)
  • *: min (Optional) max (Many)
  • +: min (Required) max (Many)
  • -: do not add this column as an output property

In the following example, the Amenities column has an input cardinality of min (Optional) max (Many), so it is prefixed with *. The column also contains array values, which require an output cardinality of max (Many), so it is suffixed with * as well.

HotelName,*Amenities*,Stars,City
The Peninsula,"pool,jacuzzi",4,Paris
Regent Berlin,"wifi,pets",5,Berlin
Brown's Hotel,wifi,5,London

You could make HotelName a required output by appending a ! to its column name:

HotelName!,*Amenities*,Stars,City

And you could make all the columns required inputs by prepending ! to the single-value ones and changing the * prefix to + for Amenities:

!HotelName!,+Amenities*,!Stars,!City

In practice, only inputs that are truly necessary for every search query should be set to a min (Required) input cardinality.
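To make the prefix and suffix notation concrete, here is a minimal Python sketch that decodes a column header into its name, input cardinality, and output cardinality. It is an illustration of the notation described above rather than the template's own parser; `parse_header` is a hypothetical name, and the sketch assumes the header carries no [filter] tag.

```python
# Marker characters and the (min, max) cardinality each one stands for.
CARDINALITY = {
    "!": ("Required", "One"),
    "*": ("Optional", "Many"),
    "+": ("Required", "Many"),
}
DEFAULT = ("Optional", "One")
MARKERS = "!*+-"


def parse_header(header):
    """Split a header such as '+Amenities*' into name, input, and output.

    A value of None for input or output means the '-' marker excluded the
    column from that role.
    """
    prefix = header[0] if header and header[0] in MARKERS else ""
    body = header[len(prefix):]
    suffix = body[-1] if body and body[-1] in MARKERS else ""
    name = body[:len(body) - len(suffix)]
    return {
        "name": name,
        "input": None if prefix == "-" else CARDINALITY.get(prefix, DEFAULT),
        "output": None if suffix == "-" else CARDINALITY.get(suffix, DEFAULT),
    }


for header in "!HotelName!,+Amenities*,!Stars,!City".split(","):
    print(parse_header(header))
```

For `+Amenities*`, this yields an input cardinality of min (Required) max (Many) and an output cardinality of min (Optional) max (Many), matching the last example above.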

Array Values

A cell can contain comma-separated array values:

HotelName     | Amenities*     | Stars | City
The Peninsula | "pool,jacuzzi" | 4     | Paris
Regent Berlin | "wifi,pets"    | 5     | Berlin
Brown's Hotel | wifi           | 5     | London

Since they contain commas, array values must be quoted in the actual CSV file. See the example in Cardinality above.

Note

Columns with array values must be suffixed with * or + to set their cardinality to max (Many). Otherwise they will default to max (One), and the whole cell will be interpreted as a single value, such as the literal text pool,jacuzzi, rather than as multiple values!
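If you generate your CSV files programmatically, a standard CSV library usually handles the quoting for you. The sketch below uses Python's built-in csv module to show that a cell containing commas is quoted on write and read back as a single cell, which can then be split into array values. How the template itself splits the cell is not shown here; the split on commas is an assumption that mirrors the description above.

```python
import csv
import io

HOTELS_CSV = """HotelName,*Amenities*,Stars,City
The Peninsula,"pool,jacuzzi",4,Paris
Regent Berlin,"wifi,pets",5,Berlin
Brown's Hotel,wifi,5,London
"""

# Reading: the quoted cell arrives as one string, commas included.
rows = list(csv.reader(io.StringIO(HOTELS_CSV)))
header, records = rows[0], rows[1:]
amenities_index = header.index("*Amenities*")
amenities = [record[amenities_index].split(",") for record in records]
print(amenities)

# Writing: csv.writer adds quotes only where a cell contains a comma.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["The Peninsula", "pool,jacuzzi", 4, "Paris"])
written_line = buffer.getvalue()
print(written_line, end="")
```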

Input Filters

Depending on your data sets, your capsule might need to handle queries that allow filtering based on values, similar to an SQL WHERE clause:

  • "Show me shoes between 50 and 100 dollars"
  • "Show me hotels of 3 stars or higher"
  • "Show me rooms with 2 beds"

The Import and Search capsule template can generate models that handle these cases for you by using filters.

To define a filter, append a [filter] tag to a column heading, such as:

HotelId,HotelName,RoomPrice[filter=min:max],Beds[filter=eq]

Use [filter=filter1:filter2:...] to specify filtering operations that are allowed on the column. The available operations are:

  • min: allow specifying a minimum value.
  • max: allow specifying a maximum value.
  • eq: allow specifying an exact value.

You can use filtering operations on columns with text and numeric values. You cannot specify filters on name, id, date, or SearchRegion columns; the Import and Search template sets up default filtering operations for those types automatically. (Date and location columns will still be searchable using all the flexibility built into viv.time and viv.geo.)
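The following Python sketch illustrates the filter tag syntax and the WHERE-like behavior the three operations describe. It is not the generated capsule's code: `parse_filter_header` and `matches` are hypothetical helpers, the room records are made-up sample data, and treating min and max as inclusive bounds is an assumption.

```python
import re

FILTER_TAG = re.compile(r"^(?P<name>[^\[\]]+)\[filter=(?P<ops>[a-z:]+)\]$")
ALLOWED_OPS = {"min", "max", "eq"}


def parse_filter_header(header):
    """Return (column_name, operations) for a header with an optional filter tag."""
    match = FILTER_TAG.match(header)
    if not match:
        return header, []
    operations = match.group("ops").split(":")
    unknown = [op for op in operations if op not in ALLOWED_OPS]
    if unknown:
        raise ValueError(f"unsupported filter operations: {unknown}")
    return match.group("name"), operations


def matches(value, minimum=None, maximum=None, equals=None):
    """Apply min, max, and eq constraints (bounds assumed inclusive)."""
    if minimum is not None and value < minimum:
        return False
    if maximum is not None and value > maximum:
        return False
    if equals is not None and value != equals:
        return False
    return True


columns = [parse_filter_header(h)
           for h in "HotelId,HotelName,RoomPrice[filter=min:max],Beds[filter=eq]".split(",")]
print(columns)

# Made-up sample rows, only to show the filters in action.
rooms = [
    {"HotelId": "H1", "RoomPrice": 80, "Beds": 2},
    {"HotelId": "H2", "RoomPrice": 150, "Beds": 1},
    {"HotelId": "H3", "RoomPrice": 95, "Beds": 2},
]
# Roughly: rooms between 50 and 100 dollars with 2 beds.
selected = [r["HotelId"] for r in rooms
            if matches(r["RoomPrice"], minimum=50, maximum=100)
            and matches(r["Beds"], equals=2)]
print(selected)
```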

Usage

When you start the Import and Search capsule template builder and choose a language for your capsule, you'll be presented with an Import and Search Template Screen.

Load the main table first (such as the games table in the basketball schedule example), then load the tables that its columns refer to (such as the teams table).

Search Capsule Entry

  1. Select the first CSV file (such as games.csv) by clicking Upload CSV.
  2. Enter a name for the concept that CSV file describes (such as Game).
  3. If you have more than one CSV file, click Add CSV and enter information for those files, too.
  4. When all CSV files are entered, click Next Step.
  5. Enter a capsule ID (such as playground.nbaSchedules).
  6. Choose a folder to save the new capsule in, or accept the default.
  7. Click Generate Capsule.

After the capsule is generated, Bixby Developer Studio opens the README file for your newly created capsule, which describes the models and basic training it generated and offers some advice about next steps.

Sample Import Data

You can download sample CSV files for a variety of possible search capsules, including the NBA teams and schedules example, at the Bixby Developers GitHub repository.

https://github.com/bixbydevelopers/sample-template-import-data