Suppose you are looking for the best restaurant in an area, and you have reviews available from two different sources. Rather than displaying some restaurants twice, you need a way to merge duplicates. Bixby handles this by setting up equivalence definitions for each concept.
Because real-world inputs can be messy, a simple comparison is not enough to decide whether or not two inputs are equivalent. For example, you may want to treat two locations as the same as long as they are close together. Or, you may want to accept business names that contain minor typos or variations. You might also have complex structures where equivalence depends on a subset of the structure's properties, such as the name and the author.
To handle this, use an equivalence definition, which specifies how the system should compare two instances of the same concept. If the function returns true for two concept instances, the system merges them and presents them as a single instance. If the function returns false, they are not the same instance. The function may also return uncertain when information is missing or when fuzzy matching returns a value below the confidence threshold. At the top-most level, only values that are considered true matches are merged. You can modify this behavior when comparing structures.
The equivalence functions discussed below are only used for merging results. They are not used by the NLU system and have nothing to do with user input.
By default, two primitive values match with true when identical and false otherwise. The equivalence function fuzzy-string-equality relaxes this threshold for strings. Here's an example:
name (BusinessName) {
  description (The name of a business.)
  equivalence: fuzzy-string-equality {
    true-tolerance (0.9)
    uncertain-tolerance (0.7)
    similarity-measure (Edit)
  }
}
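To make the two tolerances concrete, here is a minimal Python sketch of how a fuzzy string comparison could produce true, uncertain, or false. It is not Bixby's implementation. It assumes that the Edit measure is a Levenshtein distance normalized by the longer string's length, and that a score at or above a tolerance satisfies it. The names Match, edit_distance, similarity, and fuzzy_string_equality are hypothetical.

```python
from enum import Enum


class Match(Enum):
    FALSE = 0
    UNCERTAIN = 1
    TRUE = 2


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row table."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Assumed normalization: 1.0 for identical strings, 0.0 for nothing in common."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def fuzzy_string_equality(a: str, b: str,
                          true_tolerance: float = 0.9,
                          uncertain_tolerance: float = 0.7) -> Match:
    """Assumed thresholding: the score must reach a tolerance to satisfy it."""
    score = similarity(a, b)
    if score >= true_tolerance:
        return Match.TRUE
    if score >= uncertain_tolerance:
        return Match.UNCERTAIN
    return Match.FALSE
```

Under these assumptions, "Starbucks Coffee" and "Starbuks Coffee" differ by one edit over 16 characters, which scores above 0.9 and matches as true. "Joe's Diner" and "Joes Dinner" differ by two edits over 11 characters, which falls between the two tolerances and is uncertain.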
You can also set tolerances for float values (primitive type decimal) using fuzzy-numeric-equality. The syntax is the same as for fuzzy-string-equality.
You cannot use non-numeric concepts with fuzzy-numeric-equality.
You can learn more about primitive equivalence in the reference documentation.
Comparing two structures is more complicated. By default, the system walks through all the properties and compares each, descending into sub-properties as needed. Each comparison uses any available equivalence definitions for the properties. Comparison of structures with any missing properties will always return uncertain. Otherwise, comparison returns true if and only if all property comparisons return true.
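The default walk can be sketched in Python as follows. This is an illustration only: structures are modeled as dictionaries, and primitives use plain equality in place of any equivalence definitions. The text above does not say what the default returns when no property is missing but some comparison is not true, so the sketch assumes that false takes precedence over uncertain. The names Match, compare, and compare_structures are hypothetical.

```python
from enum import Enum


class Match(Enum):
    FALSE = 0
    UNCERTAIN = 1
    TRUE = 2


def compare(a, b) -> Match:
    """Compare two values, descending into nested structures (dicts)."""
    if isinstance(a, dict) and isinstance(b, dict):
        return compare_structures(a, b)
    # Default primitive behavior: true when identical, false otherwise.
    return Match.TRUE if a == b else Match.FALSE


def compare_structures(a: dict, b: dict) -> Match:
    names = set(a) | set(b)
    # Any property missing on either side makes the comparison uncertain.
    if any(a.get(name) is None or b.get(name) is None for name in names):
        return Match.UNCERTAIN
    results = [compare(a[name], b[name]) for name in names]
    if all(result is Match.TRUE for result in results):
        return Match.TRUE
    # Assumption: a definite mismatch outranks an uncertain sub-result.
    if any(result is Match.FALSE for result in results):
        return Match.FALSE
    return Match.UNCERTAIN
```

For example, two book records with the same name and author compare as true, while dropping the author from one of them yields uncertain regardless of the other properties.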
We can modify this behavior by defining equivalence as part of the concept structure. To do this, there are two primitive constraints and three conjunctions that join them together.
Here are the primitive constraints:
convertible-concepts: This returns true if two concept instances can be converted to each other. That works if both instances have the same concept type, or if one is a sub-type of the other. For example, a Business and a Restaurant are convertible types because Restaurant extends Business. A Restaurant and a Movie Theater are not convertible to each other. They both extend Business, but they have no inheritance relationship between each other.
equivalent-values(propertyName): This returns the equivalence result for a specific property. The example below specifies that Business comparison uses the name and address properties of the business.
We use conjunctions to aggregate the results of other constraints:
join { constraint1 constraint2 ... }: This acts like a min function across the nested constraints, with false ranked below uncertain and uncertain below true. If any nested constraint returns false, that is the result. Otherwise, if any nested constraint returns uncertain, that is the result. The result is true if and only if all the nested constraints return true.
optimistic-join { constraint1 constraint2 ... }: This modifies the behavior of a join by treating uncertain as true. It returns true if all the nested constraints return true or uncertain, and false otherwise. This conjunction never returns uncertain.
pessimistic-join { constraint1 constraint2 ... }: This modifies the behavior of a join by treating uncertain as false. It returns true if all the nested constraints return true, and false otherwise. This conjunction never returns uncertain.
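The three conjunctions amount to a small three-valued logic. The Python sketch below models them directly from the descriptions above. The names Match, join, optimistic_join, and pessimistic_join are hypothetical stand-ins for illustration, not part of any Bixby API.

```python
from enum import IntEnum


class Match(IntEnum):
    # Ordered so that min() picks the "weakest" result.
    FALSE = 0
    UNCERTAIN = 1
    TRUE = 2


def join(*results: Match) -> Match:
    """false wins over uncertain, which wins over true."""
    return min(results)


def optimistic_join(*results: Match) -> Match:
    """Like join, but every uncertain is first promoted to true."""
    return join(*(Match.TRUE if r is Match.UNCERTAIN else r for r in results))


def pessimistic_join(*results: Match) -> Match:
    """Like join, but every uncertain is first demoted to false."""
    return join(*(Match.FALSE if r is Match.UNCERTAIN else r for r in results))
```

With one true and one uncertain input, join returns uncertain, optimistic_join returns true, and pessimistic_join returns false. Each function expects at least one nested result; calling one with no arguments raises a ValueError from min.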
Here are some examples of equivalence definitions:
structure (Business) {
  property (address) {
    type (viv.geo.Address)
  }
  // ... more properties ...
  // Businesses get merged if their name and addresses match in a fuzzy
  // way with an "uncertain" tolerance:
  equivalence: optimistic-join {
    convertible-concepts
    equivalent-values (name)
    equivalent-values (address)
  }
}
The structure Business allows comparison of convertible concepts, so a Business can be compared with a Restaurant. For each instance, it compares only the name and address. Aggregation is optimistic, so the result will be true as long as the name and address comparisons return either true or uncertain.
This illustrates the utility of returning uncertain. It may not seem very useful when comparing two instances directly, but it can bubble up to any parent concept comparison. For example, a Business name comparison might return uncertain, while the address comparison returns true. The Business concept specifies an optimistic join across these two properties, so the result would be true.
structure (GeoPoint) {
  property (latitude) {
    type (geo.Latitude)
    min (Required)
  }
  property (longitude) {
    type (geo.Longitude)
    min (Required)
  }
  // The confidence for a point will be true, false or uncertain
  // depending on the specified location tolerances.
  equivalence: join {
    fuzzy-numeric-equality(latitude) {
      true-tolerance(0.00005)
      uncertain-tolerance(0.005)
    }
    fuzzy-numeric-equality(longitude) {
      true-tolerance(0.00005)
      uncertain-tolerance(0.005)
    }
  }
}
The above structure GeoPoint compares using the latitude and longitude properties. Two points are equivalent if and only if both properties are within the specified tolerances. If either property comparison returns false, the points are not equivalent. Otherwise, the result is uncertain.
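The GeoPoint rule can be modeled in Python as shown below. This is a sketch, not the platform's code. It assumes that each numeric tolerance is the largest absolute difference that still counts as true or uncertain, which the text above does not state explicitly. The names Match, fuzzy_numeric_equality, and geo_points_equivalent are hypothetical, and points are plain (latitude, longitude) tuples.

```python
from enum import IntEnum


class Match(IntEnum):
    FALSE = 0
    UNCERTAIN = 1
    TRUE = 2


def fuzzy_numeric_equality(a: float, b: float,
                           true_tolerance: float,
                           uncertain_tolerance: float) -> Match:
    """Assumed semantics: tolerances bound the absolute difference."""
    difference = abs(a - b)
    if difference <= true_tolerance:
        return Match.TRUE
    if difference <= uncertain_tolerance:
        return Match.UNCERTAIN
    return Match.FALSE


def geo_points_equivalent(p: tuple, q: tuple) -> Match:
    """join over latitude and longitude: the weakest result wins."""
    return min(
        fuzzy_numeric_equality(p[0], q[0], 0.00005, 0.005),
        fuzzy_numeric_equality(p[1], q[1], 0.00005, 0.005),
    )
```

Under this reading, two points whose latitudes differ by 0.001 degrees and whose longitudes are identical compare as uncertain: the latitude falls between the two tolerances, and the join takes the weaker of the two property results.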
As a special case, you can define equivalence rules for GeoPoint properties using the distance-equality constraint. This returns true if two points are within a specified geographic distance of each other. In this example, the property centroid is a GeoPoint, and comparison returns true if two centroids are separated by 0.2 miles or less.
equivalence: join {
  distance-equality (centroid) {
    unit (Miles)
    magnitude (0.2)
  }
}
You must use viv.core.BaseGeoPoint concepts with distance-equality.
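As a rough illustration of a distance check like the one above, the following Python sketch uses the haversine great-circle formula. How the platform actually measures distance is not documented here, so the formula, the Earth radius constant, and the names haversine_miles and distance_equality are all assumptions. The sketch returns a plain boolean because the text only describes the case in which the constraint returns true.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, an approximation


def haversine_miles(lat1: float, lon1: float,
                    lat2: float, lon2: float) -> float:
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    h = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def distance_equality(p: tuple, q: tuple, magnitude: float = 0.2) -> bool:
    """True when the two (latitude, longitude) points are within `magnitude` miles."""
    return haversine_miles(p[0], p[1], q[0], q[1]) <= magnitude
```

A latitude difference of 0.001 degrees is roughly 0.07 miles, so two such centroids would match. A difference of 0.01 degrees is roughly 0.7 miles, so they would not.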
You can learn more about structure equivalence in the reference documentation.