
Creating Mix.nlu models

Use Mix.nlu to build a highly accurate, high-quality custom natural language understanding (NLU) system quickly and easily, even if you have never worked with NLU before.

About Mix.nlu

Mix.nlu enables you to author Automatic Speech Recognition (ASR) and Natural Language Understanding (NLU) models for your application. Mix.nlu models are deployed from the Mix Project Dashboard. Your application accesses these models with the ASR as a Service gRPC API and the NLU as a Service gRPC API.

Develop a Mix.nlu model

To develop a Mix.nlu model, you use Mix.dashboard and Mix.nlu as described in the workflow below.

Mix.nlu workflow

The following procedure summarizes the workflow for creating an NLU model and optionally a recognition-only domain language model (DLM):

  1. Create a project: The first step is to create a project in Mix.dashboard. This project contains all the data necessary for building your models.
  2. Develop your model: You then develop your model in Mix.nlu by creating your ontology and adding samples.
  3. Train your model: Training is the process of building a model based on the data that you have provided.
  4. Test your model: After you train your model, use the Try panel to test it interactively and tune it.
  5. Build your model: When you make a build, you create a model version, which is a snapshot of your model as it exists now.
  6. Create your application configuration: To use your model in an application, you create your application configuration, which is the combination of the model versions that you want to use in your application (for example, Mix.asr model v2 with Mix.nlu model v3 for project CoffeeMaker).
  7. Deploy your application configuration: Deploy the application configuration to an environment that is accessible by your application.
  8. Discover what your users say: Collect feedback on how well your model is performing by viewing how the model handled actual user utterances in the deployed application configuration.
  9. Refine your model: Return to step 2 and refine the model based on insight from user data.

Open the project in Mix.nlu

To open a project in Mix.nlu:

  1. From Mix.dashboard, select your project in the Projects list.
  2. Click the .nlu icon.

About the Mix.nlu Develop tab

You use the Mix.nlu Develop tab to create intents and entities, add samples, try your model, and then train it.

Note the following:

Multiple language support

Mix.nlu supports multiple languages (or locales) per project. As you can imagine, sample phrases of what your users may say will differ from one language to another. Your samples, therefore, will be different per language/locale.

To filter the list of samples, select the language code from the menu near the name of your project. (If your project includes a single language, no menu appears.)

For example, this project supports three locales, with en_US currently selected:

[Figure: locale selection menu with en_US selected]

Mix.nlu also allows you to define different literals for list-type entity values per language/locale. This allows you to support the various languages in which your users might ask for an item, such as "coffee", "café", or "kaffee" for a "drip" coffee. More information on how to do this is provided in the sections that follow.

Develop your model

To develop your model, you:

  1. Add intents to your model. An intent defines and identifies an intended action. An utterance or query spoken by a user will express an intent, for example, to order a drink. As you develop an NLU model, you define intents based on what you expect your users to do in your application.
  2. Add entities to your model. Entities identify details or categories of information relevant to your application. While the intent is the overall meaning of a sentence, entities and values capture the meaning of individual words and phrases in that sentence.
  3. Link your entities to your intents. Intents are almost always associated with entities that serve to further specify particulars about the intended action.
  4. Add samples. Samples are typical sentences that your users might say. They teach Mix how your users will interact with your application.
  5. Annotate your samples. Once you define entities in an ontology, you need to annotate the tokens within the samples so that the machine learns.
  6. Modify intents and annotations. Make any required modifications to your intents and annotations.
  7. Verify samples before training. As a final step, review the verification status of each sample phrase or sentence. This is an essential step that has a direct impact on the accuracy of the data used to create your model(s).

Next, you train, test, and finally build when you're ready to use your models in your client application.

Add intents to your model

An intent is something a user might want to do in your application. You might think of intents as actions (verbs); for example, to order. For more information about intents, see Intents.

To add intents to your model:

  1. In Mix.nlu, click the Develop tab.
  2. On the Intents bar, click the plus (+) icon to add an intent.
  3. Type the name of your intent (for example, OrderCoffee) and press Enter.

The intent name is added to the list of intents.

Add entities to your model

Entities collect additional important information related to your intent. You might think of entities as analogous to slots or parameters. For example, if the user's intent is to order an espresso drink, entities might include [CoffeeType], [CoffeeSize], [Flavor], and so on.

This procedure describes how to create custom entities, which are entities that are specific to your intent. You can also use one of the predefined entities, which have already been defined to save you the trouble of creating them from scratch. Examples of predefined entities include monetary amounts, Boolean values, calendar items (dates, times, or both), cardinal and ordinal numbers, and so on. For more information, see Predefined entities.

To simplify your model, avoid adding a unique entity for each instance of a similar item. Instead, add a single entity that describes a general type of item. For example, instead of defining entities for Cappuccino, Espresso, and Americano, define a single entity named CoffeeType.

When you add an entity, you specify the following information:

Type: Specifies the type of entity. Valid values are:
  • List: A list entity has possible values that can be enumerated in a list. For example, if you have defined an intent called OrderCoffee, the entity [CoffeeType] would have a list of the types of drinks that can be ordered. See List entities.
  • Relationship: A relationship entity has a specific relationship to an existing entity, either the "isA" or "hasA" relationship. See Relationship entities.
  • Freeform: A freeform entity is used to capture user input that you cannot enumerate in a list. For example, a text message body could be any sequence of words of any length. In the query "send a message to Adam hey I'm going to be ten minutes late", the phrase "hey I'm going to be ten minutes late" becomes associated with the freeform entity [MessageBody]. See Freeform entities.
  • Regex-based: A regex-based entity defines a set of values using a regular expression. For example, use a regex-based entity to match account numbers, postal (zip) codes, order numbers, and other pattern-based formats. See Regex-based.

Referenced as: Defines how the entity can be referred to; for example, whether it is referring to a person (Contact, "him") or a place (City, "there"). These are used for handling anaphoras in dialogs.

Dynamic: (Applies to list entities only) Indicates whether the entity is dynamic. Dynamic list entities allow you to upload data dynamically at runtime. See Dynamic list entities.

Literals: (Applies to list entities only) Lets you enter literals and values. A literal is the range of tokens in a user's query that corresponds to a certain entity. With literals, you can specify misspellings and synonyms for an entity's value. For example, in the queries "I'd like a large t-shirt" and "I'd like t-shirt, size L", the literals corresponding to the entity [ShirtSize] are "large" and "L", respectively. In both cases, the value is the same. Literals can be paired with values, which are then returned in the NLU interpretation result. For example, "small", "medium", and "large" can be paired with the values "S", "M", and "L". For projects that include multiple languages, you can specify variations per language/locale for an entity value.
  See List entities for details.
  Note: There is a limit to the number of literals that you can enter. See Limits for more information.
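
To illustrate the kind of pattern a regex-based entity might capture, the following Python sketch matches a five-digit US ZIP code with an optional four-digit extension. The pattern and the function name find_zip_codes are assumptions made for illustration only; the regular expression syntax that Mix.nlu accepts may differ from Python's.

```python
import re

# Hypothetical pattern: five digits, optionally followed by a hyphen and four digits.
ZIP_CODE = re.compile(r"\b\d{5}(?:-\d{4})?\b")


def find_zip_codes(utterance: str) -> list[str]:
    """Return every substring of the utterance that looks like a ZIP code."""
    return ZIP_CODE.findall(utterance)
```

A pattern like this describes a whole family of values without listing them, which is why a regex-based entity suits formats that cannot reasonably be enumerated in a list entity.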

To add entities to your model:

  1. On the Entities bar, click the plus (+) icon.
  2. Type the name of the entity (for example, CoffeeSize) and press Enter.
  3. Click the entity to see its details.
  4. Define the entity (see the table above for a description of the fields).

The next step is to link your entities to your intents so that they can be interpreted.

For example, if you have an intent called OrderCoffee that uses the CoffeeSize and CoffeeType entities, you need to link these entities with the OrderCoffee intent. You also need to link any predefined entities that you want to use.

To link entities to your intents:

  1. On the Intents bar, select the intent.
  2. Click the link entity plus (+) icon and select the entity to link.
  3. Repeat for each entity that you want to link to the intent.

Add samples

Samples are typical phrases or sentences that your users might say. They teach Mix how your users think (their mental models) when interacting with your application.

If your project includes multiple languages, be sure to select the appropriate locale before you start to enter samples.


You can enter a maximum of 500 characters per sample.

To add samples:

  1. (As required) Select the locale from the menu near the name of the project.
  2. In the Intents area, click the name of the intent.
  3. In the "The user says" field, type a sample utterance and press Enter. For example, "I want a double espresso."
  4. Repeat this procedure a few times.

The more samples you include, the better your model becomes at interpreting what your users say.

For optimal machine learning, samples should be based on data of real-world usage. For information on importing samples, see Import data. Also see Creating and Annotating Datasets for Optimal Accuracy.

Annotate your samples

The next step is to annotate the literals in your samples with entities and tag modifiers.

For example, consider the following sentences:

Annotate these sample sentences to indicate which entities correspond to the literals. For example:

To annotate a sample, you first need to select the relevant tokens in the sample. Note that a literal can potentially span multiple consecutive words, for example, "United States of America". Click on the first and last words for the literal. This highlights and brackets the span of words you want to label. It also opens a menu to select an entity label.

If you make a mistake and need to deselect and start again, simply click anywhere on the screen. Once you have finished selecting the relevant tokens, select the appropriate entity from the menu to apply the annotation.

To add AND, OR, or NOT tag modifiers to your annotation, first annotate the entities you want to modify. Then select the entities to modify by clicking the first annotation and then clicking the last annotation. Select Tag Modifier and the appropriate modifier from the menu.

For example, consider the sentence: "I want a cappuccino and a latte."

To annotate with the AND modifier, first annotate the sentence to indicate the entities for your literals:

I want a [CoffeeType]cappuccino[/] and a [CoffeeType]latte[/]

Next, click the annotation for cappuccino and then the annotation for latte. With both annotations selected, choose the AND modifier in the menu. The AND modifier is added:

I want a [AND][CoffeeType]cappuccino[/] and a [CoffeeType]latte[/][/]
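
The bracketed notation shown above can be read mechanically. The following Python sketch extracts the labeled spans from an annotated sample, including spans nested inside a tag modifier. It assumes only the [Label]text[/] notation used in this section; the function parse_annotations is hypothetical and is not part of Mix.nlu.

```python
import re

# Tokens: the closing tag [/], an opening tag such as [CoffeeType], or plain text.
TOKEN = re.compile(r"\[/\]|\[([A-Za-z_][A-Za-z0-9_]*)\]|[^\[]+")


def parse_annotations(sample: str) -> list[tuple[str, str]]:
    """Return (label, literal) pairs for each annotated span, innermost first."""
    stack: list[tuple[str, list[str]]] = []  # open labels and the text seen so far
    spans: list[tuple[str, str]] = []
    for match in TOKEN.finditer(sample):
        token = match.group(0)
        if token == "[/]":
            if not stack:  # stray closing tag: ignore it
                continue
            label, parts = stack.pop()
            text = "".join(parts)
            spans.append((label, text))
            if stack:  # a nested span also belongs to its parent's text
                stack[-1][1].append(text)
        elif match.group(1):
            stack.append((match.group(1), []))
        elif stack:  # plain text inside an open span
            stack[-1][1].append(token)
    return spans
```

For the AND example above, this returns the two [CoffeeType] spans followed by the enclosing AND span, whose text is "cappuccino and a latte".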

For information on verifying the status of samples, see Verify samples.

Modify intents and annotations

Mix.nlu provides various ways to modify the intents and annotations that you have added.

Fix incorrect samples

If you make typos while adding samples, or if some samples were not transcribed correctly, you should fix them to make sure that they correspond to what users actually said. This builds a better model.

To fix an incorrect sample:

  1. Click the ellipsis icon beside the sample that you want to edit and click Edit.
  2. Correct the text as appropriate.
  3. Click the checkmark to save your changes.

Edit or remove annotations

To change an entity that annotates a sample:

  1. Click the entity in the sample then click Remove.
  2. To choose a new entity, click the literal and choose a new entity.

Change intent

To assign a sample to a different intent, use the Move selected Samples dialog. When moving sample sentences, you can choose to also move or delete any annotations that you've made.

To assign a sample sentence to a different intent:

  1. Click the ellipsis icon beside the sample and click Change Intent.
  2. In the Move selected Samples dialog, select an option for moving your selected sample: to use an existing intent, or to create a new one.
  3. Click Next.
  4. Either:
    • Choose an existing Intent: Choose another intent, NO_INTENT, or Unassigned Samples.
    • Create a new Intent: Enter a name for the new intent.
  5. Click Next.
  6. Choose to import or remove annotated entities. (This step is not available when moving samples to Unassigned Samples.)
  7. Click Next.
  8. To confirm, click Finish.

Use the check boxes (or the select-all check box above the list of samples) to move multiple samples.

Assign NO_INTENT

Sometimes an entity applies to more than one intent or, to look at it another way, an entity can mean different things depending on the dialog state. Rather than add this entity to multiple intents, it's best to use NO_INTENT.

Consider these two example interactions. The first one is in the context of booking a meeting.

User: Create a meeting
System: For when?
User: Tomorrow at 2

This second example is in the context of booking a flight.

User: Book flight to Paris
System: For when?
User: Tomorrow at 2

In each of these interactions, there is a clear intent in the user's first statement, but the second utterance on its own has no clear intent.

In this case, it's best to tag "Tomorrow at 2" as [nuance_CALENDARX]Tomorrow at 2[/] to cover both scenarios (and not as [MeetingTime]Tomorrow at 2[/] or [FlightDepartureTime]Tomorrow at 2[/]).

As shown in the examples, often these words or phrases are fragments and are used in a dialog as follow-up statements or queries.
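
One way a client application might use such a NO_INTENT result is to let the dialog state decide what the generic entity means. The following Python sketch is illustrative only: the dialog state names, the SLOT_FOR_STATE mapping, and resolve_follow_up are hypothetical and are not part of Mix.nlu.

```python
# Hypothetical mapping from the dialog state to the application slot that a
# generic nuance_CALENDARX value should fill.
SLOT_FOR_STATE = {
    "booking_meeting": "MeetingTime",
    "booking_flight": "FlightDepartureTime",
}


def resolve_follow_up(dialog_state: str, entity: str, literal: str) -> dict[str, str]:
    """Route a NO_INTENT follow-up entity to the slot implied by the dialog state."""
    if entity != "nuance_CALENDARX":
        raise ValueError(f"unexpected entity: {entity}")
    return {SLOT_FOR_STATE[dialog_state]: literal}
```

The same annotated fragment, "Tomorrow at 2", then fills the meeting time in one dialog and the flight departure time in the other, without the model needing a separate entity for each.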

Verify samples before training

Before generating models, verify your training sample data. This step involves reviewing each sample phrase or sentence for intents and entities and ensuring that they have been assigned the correct status. It also involves confirming which samples to include in the training set for the model, and which to exclude.

This process improves your model's accuracy.

Verification of the sample data needs to be carried out for each language in the model, and for each intent.

Open and view samples by language and intent

To get started, open up the set of sample sentences for the language and intent.

  1. Open the Develop tab.
  2. (For multi-locale projects) Select the locale from the menu near the name of the project.
  3. Click an intent to view the samples.

Overview of verification states

Samples can be in the following verification states:

Intent-assigned (half-filled circle icon): The sample has been assigned an intent, for example, via .txt or TRSX file upload, by adding a sample using Try, or by manually adding a sample phrase or sentence to an intent in the Mix.nlu UI. The sample may or may not be annotated.
  Impact of this state on the model: Samples in this state are used only to detect the intent. The data provided by the sample is not used to detect the presence of entities.

Annotation-assigned (filled-circle icon): The sample has been assigned an intent and annotation is complete. A sample can be set to Annotation-assigned via TRSX file upload or in the Mix.nlu UI. The sample may or may not be annotated.
  Impact of this state on the model: Samples in this state are used to detect the intent as well as any annotated entities. If such a sample contains a literal that appears in an entity but is not annotated, it is used as a "counter example" for that entity; that is, it lowers the chance of such entity literals being detected.

Excluded ("pause" icon): The sample, although assigned an intent, is excluded from the model. A sample can be excluded in the UI or via TRSX file upload. The sample may or may not be annotated.
  Impact of this state on the model: Samples in this state are excluded from the model.

Unverified (no icon): The sample is assigned to Unassigned Samples, either via .txt or TRSX file upload or manually in the UI. The sample contains no annotations and is excluded from the model.
  Impact of this state on the model: Samples in this state are excluded from the model.
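
The effect of each state on training can be summarized in a few lines of code. The following Python sketch restates the descriptions above; the VerificationState enum and the two helper functions are hypothetical names used for illustration, not a Mix.nlu API.

```python
from enum import Enum


class VerificationState(Enum):
    INTENT_ASSIGNED = "intent-assigned"
    ANNOTATION_ASSIGNED = "annotation-assigned"
    EXCLUDED = "excluded"
    UNVERIFIED = "unverified"


def trains_intent(state: VerificationState) -> bool:
    """True if samples in this state are used to detect the intent."""
    return state in (
        VerificationState.INTENT_ASSIGNED,
        VerificationState.ANNOTATION_ASSIGNED,
    )


def trains_entities(state: VerificationState) -> bool:
    """True if samples in this state are also used to detect entities."""
    return state is VerificationState.ANNOTATION_ASSIGNED
```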

Display status information

By default, status information is not displayed. To see the status information, click the Status visibility toggle.
Status icons then appear to the left of the sample items (or on the right for samples in right-to-left scripts).

Exclude or include samples

You can exclude a sample from your model without having to delete and then add it again. By default, new samples are included in the next model that you build. By excluding a sample, you specify that you do not want it to be used for training a new model. For example, you might want to exclude a sample from the model that does not yet fit the business requirements of your app.

To exclude a sample, click the ellipsis icon beside the sample and then choose Exclude.


An excluded sample appears with gray diagonal bars and the status icon changes to indicate it is excluded.

You can still modify the excluded sample. Any annotations that were attached to the sample before it was excluded are saved in case you want to re-include it later.

To include a previously excluded sample, either use the ellipsis icon menu or click on the status icon. The sample is restored to its previous state with any previous intent and annotations restored.

Change the status of a sample

When you start annotating a sample assigned to an intent, its state automatically changes from Intent-assigned to Annotation-assigned. This signals to Mix.nlu that you intend to add the sample to your model(s). You can always choose to assign a different state to the sample; for example, to exclude it (change the state to Excluded) or to use it to detect intent only (change to Intent-assigned).

To change the status of a sample, hover over the status icon and click. This allows you to change the state from Intent-assigned to Annotation-assigned or vice versa.

Filter displayed samples by status

When there are a lot of samples for an intent, you may want to filter the displayed samples by status. To do this, open the drop-down menu next to the status visibility toggle to choose the status to display.


Bulk operations

To change the verification state of multiple samples at once, use the check boxes (or the select-all check box above the list of samples) to choose multiple samples. Select the appropriate icon from the row above the samples to include or exclude samples, assign them as Intent-assigned, or assign them as Annotation-assigned. You can also choose to remove the selected samples or move them to another intent.
The general idea is that bulk operations apply to all selected samples, but there are operation-specific particularities you should be aware of.

Notes on behavior, by operation:

Exclude: Already excluded samples stay as-is. Intent-assigned and Annotation-assigned samples are excluded, but the previous state, including any assigned intent and annotations, is remembered in case you want to re-include the sample.
Include: Already included samples stay as-is. Previously excluded samples are re-included with the same verification state as they had before being excluded.
Intent-assigned: Excluded samples are not impacted and stay excluded.
Annotation-assigned: Excluded samples are not impacted and stay excluded.

Only visible samples can be selected for mass status change, that is, samples that have not been filtered from the view.
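
The bulk-operation behavior described in this section can be modeled as a small state machine. The following Python sketch is a simplified illustration under stated assumptions: the Sample class and the three functions are hypothetical, and states are represented as plain strings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sample:
    text: str
    state: str                      # "intent-assigned", "annotation-assigned", or "excluded"
    previous: Optional[str] = None  # state remembered while the sample is excluded


def bulk_exclude(samples: list[Sample]) -> None:
    """Exclude samples, remembering the prior state; excluded samples stay as-is."""
    for sample in samples:
        if sample.state != "excluded":
            sample.previous, sample.state = sample.state, "excluded"


def bulk_include(samples: list[Sample]) -> None:
    """Re-include excluded samples with the state they had before exclusion."""
    for sample in samples:
        if sample.state == "excluded" and sample.previous is not None:
            sample.state, sample.previous = sample.previous, None


def bulk_assign(samples: list[Sample], new_state: str) -> None:
    """Assign a verification state; excluded samples are not impacted."""
    for sample in samples:
        if sample.state != "excluded":
            sample.state = new_state
```

Keeping the previous state alongside the current one is what allows a re-included sample to return to exactly where it was.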

Notes

Train your model

Training is the process of building a model based on the data that you have provided.

If your project (or locale) contains no samples, you cannot train a model. You need at least one sample sentence that is either intent-assigned or annotation-assigned. Be sure to verify samples.

Developing a model is an iterative process that includes multiple training passes. For example, you retrain your model when you add or remove sample sentences, annotate samples, verify samples, include or exclude certain samples, and so on. When you change the training data, your model no longer reflects the most up-to-date data. When this happens, the model must be retrained so that you can test the changes, expose errors and inconsistencies, and so on.

To train your model:

  1. In Mix.nlu, click the Develop tab.
  2. (As required) Select the locale from the menu near the name of the project.
  3. Click Train Model.

Mix.nlu trains your model. This may take some time if you have a large training set. A status message is displayed when your model is trained.

To view all status messages (notifications), open the Console panel.

Training a model that includes prebuilt domains

If you have imported one or more prebuilt domains, click the Train Model button to choose to include your own data and/or the prebuilt domains. Since some prebuilt domains are quite large and complex, you may not want to include them when training your model.

To train your model to include one or more domains:

  1. Click the arrow beside Train Model.
    The list of prebuilt domains is displayed in addition to your own data.
    In the example below, the Nuance TV and Nuance Weather prebuilt domains have been imported into the project:
  2. Check the domains you want to include.
  3. Check My data to include your data.
  4. Click Train Model.

Test it

After you train your model, test it interactively in the Try panel. Use testing to tune your model so your client application can better understand its users.

If you are unsatisfied with the result, you can manually assign an intent or annotate your sentence by adding it to your project as a new sample.

The Results area shows the interpretation with the highest confidence. Information from the NLU engine, including all interpretations, appears on the right formatted as a JSON object. For more information on the fields in an interpretation, see the reference section: Interpretation.
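
As a rough illustration of how a list of interpretations relates to the single result shown in the Results area, the following Python sketch picks the interpretation with the highest confidence from a JSON payload. The payload shape and the field names "interpretations", "intent", and "confidence" are simplified assumptions made for this example; see the Interpretation reference for the actual fields.

```python
import json

# Hypothetical, simplified payload. The real field names may differ.
RESPONSE = """
{"interpretations": [
  {"intent": "OrderCoffee", "confidence": 0.92},
  {"intent": "NO_MATCH", "confidence": 0.08}
]}
"""


def best_interpretation(payload: str) -> dict:
    """Return the interpretation with the highest confidence score."""
    interpretations = json.loads(payload)["interpretations"]
    return max(interpretations, key=lambda item: item["confidence"])
```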

To test your model:

  1. In Mix.nlu, click the Develop tab.
  2. (As required) Select the locale from the menu near the name of the project.
  3. Click Try. The Try panel appears.
  4. Enter a sentence your users might say and press Enter.
  5. If the intent is incorrect, click Add Sample and then change the intent.
  6. If the annotated entities are incorrect, click Add Sample and then edit the annotation.

The Try panel presents the response from the NLU engine. For example, the Results area might display the OrderCoffee intent with a confidence score of 1.00.

The Results area reflects your intents and entities only as of the last time you trained the model; changes made since then do not appear until you retrain. No annotation appears in the Results area if the NLU engine cannot use your model to interpret the entities in your sample. Also, there is no annotation for dynamic list entities. Only your client application provides this information at runtime.

Discover what your users say

Now that your model is ready, you can look at the sentences that people speak or type while using your application. These sentences appear in the Discover tab. You’ll review them there, then add the ones you want directly into your intents on the Develop tab to improve and grow your model.

To access Discover tab information for a project and application configuration:

  1. From the Mix Dashboard, select a project with a deployed application configuration.
  2. Click the .nlu icon.
  3. Select the Discover tab.
  4. Select the application, associated context tag, and environment.
  5. Click Apply Filters.

Within the Discover tab, you can view information on speech or text input from application users. The information is presented in tabular format, with the following details available for each entry.

Intent: The intent identified by the model for the user input. If the model determines that the sentence does not seem to fit any of the expected intents, it shows NO_MATCH. NO_MATCH cases can help you identify intents that were not considered before but which are important to users. These can be added to refine and improve the model.
Samples: The content of the user input, as text. The sample may include annotations attached by the model if (1) the model identified an intent, (2) the identified intent has entities defined, and (3) the model confidently identified and picked out entity values from the sentence.
Score: The model’s level of confidence in the inferred intent, as a decimal between 0.00 and 1.00.
Collected on: Date and time the input was collected.
Region: Deployment region where the user interaction occurred.

If there is a lot of user data, the data is presented in pages.

You can sort the data by column. Click on the column title to sort. By default, the data is sorted on the "Collected on" column. Clicking on a column header a second time will sort on that column in the opposite order.

While it is not currently possible to directly transfer the data from the Discover tab into the model, you can download the data in the table as a .csv file. To do this, click the download icon above the table. You can then process the data externally into a format that can be imported into Mix.nlu. For more information about importing data into a model, see Importing and exporting data.
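
As one possible form of external processing, the following Python sketch reads downloaded Discover data and keeps the sentences most worth reviewing: those labeled NO_MATCH or scored at or below a threshold. It assumes that the .csv header names match the column names shown in the table ("Intent", "Samples", "Score"); the actual header names, the threshold, and the function samples_to_review are assumptions for illustration.

```python
import csv
import io


def samples_to_review(csv_text: str, max_score: float = 0.5) -> list[str]:
    """Return sample texts that were NO_MATCH or scored at or below max_score."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row["Samples"]
        for row in reader
        if row["Intent"] == "NO_MATCH" or float(row["Score"]) <= max_score
    ]
```

The resulting sentences could then be reviewed by hand and prepared for import as new samples.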

Using the insights gained from the Discover tab, you can go back to the Develop tab to add new intents, entities, and samples, go to the Build tab to redeploy your updated model, and finally view the data from your newest model on the Discover tab. Rinse and repeat! You can improve your model (and your application) over time using an iterative feedback loop.

Mix.nlu Console panel

The Console panel displays Mix.nlu notifications.

Display the Console panel

To display the Console panel, click the Console panel icon.

To hide it, click the icon again.

View notifications

The Notifications tab displays notifications for your projects. From this tab you can:

Ontology

In natural language understanding, an ontology is a formal definition of entities, ideas, events, and the relationships between them, for some knowledge area or domain. The existence of an ontology enables mapping natural language utterances to precise intended meanings within that domain.

In the context of Mix.nlu, an ontology refers to the schema of intents, entities, and their relationships that you specify and that are used when annotating your samples, and interpreting user queries.

Intents

An intent identifies an intended action. For example, an utterance or query spoken by a user expresses an intent to order a drink. As you develop an NLU model, you define intents based on what you want your users to be able to do in your application. You then link intents to functions or methods in your client application logic.

Here are some examples of intents you might define:

Intents are often associated with entities to further specify particulars about the intended action.

Entities

An entity is a language construct for a property, or particular detail, related to the user's intent. For example, if the user's intent is to order an espresso drink, entities might include [CoffeeType], [Flavor], [Temperature], and so on. You can link entities and their values to the parameters of the functions and methods in your client application logic.
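
The link between intents and functions, and between entities and parameters, can be sketched as a dispatch table in the client application. The following Python example is a minimal illustration: order_coffee, HANDLERS, PARAMETERS, and dispatch are hypothetical application code, not part of Mix.nlu.

```python
def order_coffee(coffee_type: str, coffee_size: str = "medium") -> str:
    """Hypothetical application function that fulfills the OrderCoffee intent."""
    return f"Ordering a {coffee_size} {coffee_type}"


# Intent name -> application function.
HANDLERS = {"OrderCoffee": order_coffee}

# Entity name -> parameter name of the application function.
PARAMETERS = {"CoffeeType": "coffee_type", "CoffeeSize": "coffee_size"}


def dispatch(intent: str, entities: dict[str, str]) -> str:
    """Call the function linked to the intent, passing entity values as parameters."""
    handler = HANDLERS[intent]
    kwargs = {PARAMETERS[name]: value for name, value in entities.items()}
    return handler(**kwargs)
```

For example, dispatch("OrderCoffee", {"CoffeeType": "espresso", "CoffeeSize": "large"}) calls order_coffee with both values filled in from the interpretation.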

If an entity applies to a particular intent, it is referred to as a relevant entity for that intent. The idea of relevant entities is important:

Mix.nlu supports the following entity types:

List entities

A list entity has possible values that can be enumerated in a list. For example, if you have defined an intent called OrderCoffee, the entity [CoffeeType] would have a list of drink types that can be ordered. Other list-type entities might include song titles, states of a light bulb (on or off), names of people, names of cities, and so on.

A literal is the range of tokens in a user's utterance or query that corresponds to a certain entity. The literal is the exact spoken text. For example, in the query "I'd like a large t-shirt", the literal corresponding to the entity [ShirtSize] is "large". Other literals might be "small", "medium", "big", and "extra large". When you annotate samples, you select a range of text to tag with an entity. For list-type entities, you can then add the text to the list for the entity. Lists of literals can also be uploaded in .list or .nmlist files. For more information, see Importing entity literals.

Literals can be paired with values. For example, "small", "medium", and "large" can be paired with the values "S", "M", and "L", respectively. Multiple literals can have the same value, which makes it easy to map different ways a user might say an entity into a single common form. For example, "large", "big", "very big" could all be given the same value "L".
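
The many-to-one relationship between literals and values amounts to a lookup table. The following Python sketch uses the shirt size example from this section; the dictionary name SHIRT_SIZE and the function value_for are hypothetical, and the case-insensitive lookup is an assumption of the sketch, not documented Mix.nlu behavior.

```python
# Literal -> value. Several literals map to the same value "L".
SHIRT_SIZE = {
    "small": "S",
    "medium": "M",
    "large": "L",
    "big": "L",
    "very big": "L",
}


def value_for(literal: str):
    """Return the value paired with a literal, or None if the literal is unknown."""
    return SHIRT_SIZE.get(literal.lower())
```

Whatever wording the user chooses, the client application only has to handle the common value.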

Defining literal-value pairs per language

If your project includes multiple languages, you will want to support the various ways that users might ask for an item in their language of choice. List-based entities created in a project are shared across languages. The values and associated literals connected to the entity, however, are created and managed separately by language. This gives flexibility to handle situations where the value options vary by language and location.

When you add a value-literal pair, this pair will apply to the entity only in the currently selected language. The same value name can be used in multiple languages for the same list-based entity, but the value and its literals need to be added separately in each language.

To add a new value and a literal for a list-based entity within the currently selected language, enter the literal and value in the Entity list pane where indicated and then click the plus (+) icon. The new value appears in the list along with the first literal. You can also click there to add new literals that map to the same entity value. Again, the literal-value pairs added will not be automatically added to the other languages in the project.

To remove a literal, click the delete icon next to the literal. You are asked to confirm the deletion. This removes the literal from the currently selected language.
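
The per-language behavior described in this section can be pictured as one entity holding a separate literal-to-value table for each locale. The following Python sketch is a conceptual model only: the ListEntity class is hypothetical, and the locale codes other than en_US are assumed for the example.

```python
class ListEntity:
    """A list entity shared across locales, with literals managed per locale."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._literals: dict[str, dict[str, str]] = {}  # locale -> literal -> value

    def add(self, locale: str, literal: str, value: str) -> None:
        """Add a literal-value pair to one locale only."""
        self._literals.setdefault(locale, {})[literal] = value

    def remove(self, locale: str, literal: str) -> None:
        """Remove a literal from one locale; other locales are untouched."""
        self._literals.get(locale, {}).pop(literal, None)

    def value_for(self, locale: str, literal: str):
        """Return the value for a literal in the given locale, or None."""
        return self._literals.get(locale, {}).get(literal)
```

Adding "coffee" with the value "drip" in en_US does not make "coffee" known in any other locale; each language needs its own pairs, even when they share the same value name.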


Dynamic list entities

It is not always feasible to know all possible literals when you create a model, and you may need the ability to interpret values at runtime. For example, each user will have a different set of contacts on his or her phone. It is not practical (or doable) to add every possible set of contact names to your entity when you are building your model in Mix.nlu.

Dynamic list entities allow you to upload data dynamically at runtime and use this data to provide personalization and to improve recognition and natural language understanding accuracy.

Defining dynamic entities

To define a list entity as dynamic, check the Dynamic box for this entity.

While dynamic data is uploaded at runtime, it is still important to define a representative subset of literal and value pairs for dynamic list entities. This ensures that the model is trained properly and improves the accuracy of the ASR. Using our contact example, this means that you should include a representative subset of what you expect contact names to look like, and ensure that you have samples with the proper annotation.
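
The relationship between the design-time subset and the runtime data can be sketched as follows. This is a conceptual Python illustration only: it does not use or depict the Mix runtime API, and `BASE_CONTACTS`, `resolve_contact`, and the rule that runtime data is checked first are all assumptions made for the example.

```python
from typing import Dict, Optional

# Representative subset defined when building the model (hypothetical data).
BASE_CONTACTS: Dict[str, str] = {"adam": "adam", "bob": "bob"}


def resolve_contact(literal: str, runtime_contacts: Dict[str, str]) -> Optional[str]:
    """Look up a contact literal in per-user runtime data, then in the base list."""
    key = literal.strip().lower()
    # Assumption for this sketch: user-specific data is consulted first.
    if key in runtime_contacts:
        return runtime_contacts[key]
    return BASE_CONTACTS.get(key)


# Data that would only be known at runtime, for one particular user.
user_contacts = {"zofia": "contact-017"}
```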

When naming your dynamic entities in each model, keep in mind that they are global per application ID (across languages and deployed model versions).

Freeform entities

A freeform entity is used to capture user input that you cannot enumerate in a list. For example, a text message body could be any sequence of words of any length. In the query "send a message to Adam hey I'm going to be ten minutes late", the phrase "hey I'm going to be ten minutes late" becomes associated with the freeform entity [MessageBody].

Having difficulty determining which type to use? See the examples below.

Example sports application – List type

Consider a sports application, where your samples would include many ways of referring to one sports team, for example, the Montreal Canadiens:

Since you could enumerate each option, you would make this a list type and annotate it accordingly. Additionally, the NLU engine would learn about the entity from these different ways of referring to the Canadiens. You would not have to enumerate every possible sports team or every possible way to refer to the Canadiens.

Example SMS app – Freeform type

When your sample includes text that does not have well-defined many-to-one relationships and that cannot be fully enumerated, use the freeform entity type. Consider an SMS app, where it is impossible to list every way that a user may say something to your app. The body of an SMS message could be literally anything. Here is an example of what those annotations might look like:

[MessageBody] would be a freeform entity because it is unpredictable and cannot be fully enumerated.

Be aware that any words inside the freeform tag do not improve your NLU model. The text marked as the freeform part of the sample (and only that part!) is like a black box that won’t be further analyzed. Additionally, the ASR engine won’t be able to improve the recognition of these words as it would be able to do for words in a list type. Use the freeform type with care.

Best practice

Be careful not to overuse the freeform entity, especially when an entity has many options but already has large base grammars, like SONGS or CITIES. It is not recommended to use the freeform entity for items like these because they already have a huge number of predefined values that the NLU engine has been trained on.

Relationship entities: isA and hasA

A relationship entity has a specific relationship to an existing entity, either the "isA" or "hasA" relationship.

An isA relationship states that [ENTITY_X] is a type of [ENTITY_Y]. X inherits the definition of Y, for example, its list of literals, grammars, or relationships. Note that while the definition of the child entity is the same as that of the parent entity, the child entity picks up differences because of its different role in your samples.

For example, say you have a train schedule app and you want to accept queries such as "When is the next train from Boston to New York." Both "Boston" and "New York" are instances of the [Station] entity. If you annotated the query using [Station] for both cases, then you would have no way of determining which is the origin and which is the destination. To resolve this, you could instead define two list-type entities, [FromStation] and [ToStation], and associate each with the same list of literals. This would, of course, be time consuming and difficult to manage. The better solution is to define one list-type entity [Station] with an associated list of cities/stations, and then define [FromStation] isA [Station], and [ToStation] isA [Station]. Now, you only have one list of stations to manage. The model interprets queries and returns [FromStation] or [ToStation] as appropriate for the roles they play in the query, and returns literals and values from the list associated with the [Station] entity.
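
The train example can be modeled with a small data structure in which the child entities hold no literals of their own. This Python sketch is illustrative only; the `Entity` class and the station values `BOS` and `NYP` are assumptions, not Mix.nlu output.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Entity:
    name: str
    literals: Dict[str, str] = field(default_factory=dict)
    is_a: Optional["Entity"] = None  # parent entity, if any

    def lookup(self, literal: str) -> Optional[str]:
        """Resolve a literal here, then fall back to the isA parent."""
        key = literal.lower()
        if key in self.literals:
            return self.literals[key]
        return self.is_a.lookup(literal) if self.is_a else None


# One list of stations to manage (values are hypothetical).
station = Entity("Station", {"boston": "BOS", "new york": "NYP"})
from_station = Entity("FromStation", is_a=station)
to_station = Entity("ToStation", is_a=station)

# "When is the next train from Boston to New York"
interpretation = {
    from_station.name: from_station.lookup("Boston"),
    to_station.name: to_station.lookup("New York"),
}
```

The result distinguishes the two roles while both values still come from the single [Station] list.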

You can also make isA relationships to predefined entities. For example, [Age] isA [nuance_CARDINAL_NUMBER].

A hasA relationship states that [Entity_Y] is a property or a part of [Entity_X]. That is, X has a Y. For example, the entity [FullName] might have the entities [GivenName] and [FamilyName] as part of it. The entity [Drink] might have [CoffeeType] and [Size] as part of it. Note that unlike an isA relationship, an entity can have multiple hasA relationships.

You would use hasA relationships if the entities in your queries have structure. However, Nuance recommends that you use hasA relationships only if you have a definite need, since they can be tricky to work with, and the complexity means the NLU models may be less accurate than desired. An example of a definite need is to be able to interpret a query like "put the red block into the green box". In this case you need a way to associate the color red with the block and the color green with the box. Without using hasA relationships the JSON object returned would be flat and you would not know which color went with which object. Using hasA, you would define an [Object] that has a [Color] and [Shape]. Then the following annotation becomes possible: "put the [Object][Color]red[/][Shape]block[/][/] into the [Object][Color]green[/][Shape]box[/][/]".
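
To see how the nested annotation keeps each color with its object, the bracket markup can be parsed into a tree. The parser below is a minimal Python sketch written for this illustration; `parse_annotation` is a hypothetical helper, and the resulting structure is not the JSON format returned by the NLU service.

```python
import re
from typing import Any, Dict, List

# Matches an opening tag such as [Color] or the closing tag [/].
TAG = re.compile(r"\[(/|[A-Za-z_][A-Za-z0-9_]*)\]")


def parse_annotation(sample: str) -> List[Dict[str, Any]]:
    """Parse '[Entity]text[/]' markup into a list of nested entity nodes."""
    root: Dict[str, Any] = {"entity": None, "text": "", "children": []}
    stack = [root]
    pos = 0
    for tag in TAG.finditer(sample):
        # Text before this tag belongs to the innermost open node.
        stack[-1]["text"] += sample[pos:tag.start()]
        pos = tag.end()
        if tag.group(1) == "/":
            if len(stack) == 1:
                raise ValueError("closing tag without an open entity")
            stack.pop()
        else:
            node = {"entity": tag.group(1), "text": "", "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
    if len(stack) != 1:
        raise ValueError("entity tag was never closed")
    return root["children"]


SAMPLE = ("put the [Object][Color]red[/][Shape]block[/][/] into the "
          "[Object][Color]green[/][Shape]box[/][/]")
objects = parse_annotation(SAMPLE)
pairs = [{c["entity"]: c["text"] for c in o["children"]} for o in objects]
# pairs == [{"Color": "red", "Shape": "block"}, {"Color": "green", "Shape": "box"}]
```

Each [Object] node carries its own [Color] and [Shape] children, which is the association a flat result would lose.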

Anaphoras

An anaphora is defined as "the use of a word referring back to a word used earlier in a text or conversation, to avoid repetition" (from Lexico/Oxford dictionary).

Anaphoras occur often in dialogs and can make it difficult to understand what the user means. For example, consider the following phrases:

In this example, "there" is an anaphora for "Montreal".

In this example, "him" is an anaphora for "Bob".

An ellipsis (intent anaphora) occurs when a user references an intent that was identified in a previous request. The dialog recognizes when the wording of the new request refers to the intent of the previous request, including its entities. For example:

  * User: “What is the weather in Boston this weekend?”
  * System: “This weekend in Boston the weather will be …”
  * User: “What about Montreal?”
  * The system understands that the intent is to find the weather and carries over the weekend entity: “This weekend in Montreal, the weather will be …”

Note: Ellipses are supported only in the context of the most recent intent; the system cannot recognize earlier intents.
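
The carry-over in the weather exchange can be sketched as a merge of the previous interpretation with the entities of the follow-up request. This Python example is a conceptual illustration; `apply_ellipsis`, the intent name `GET_WEATHER`, and the entity names `City` and `Date` are hypothetical.

```python
from typing import Any, Dict


def apply_ellipsis(previous: Dict[str, Any],
                   follow_up_entities: Dict[str, str]) -> Dict[str, Any]:
    """Reuse the most recent intent and its entities, overriding the new ones."""
    merged = dict(previous["entities"])
    merged.update(follow_up_entities)
    return {"intent": previous["intent"], "entities": merged}


# "What is the weather in Boston this weekend?"
previous = {
    "intent": "GET_WEATHER",
    "entities": {"City": "Boston", "Date": "this weekend"},
}
# "What about Montreal?"
current = apply_ellipsis(previous, {"City": "Montreal"})
```

Only the most recent interpretation is passed in, which mirrors the limitation that earlier intents are not considered.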

Tagging anaphoras

In Mix.nlu, you can:

This will help your dialog application determine to which entity the anaphora refers, based on the data it has, and internally replace the anaphora with the value to which it refers. For example, "Drive there" would be interpreted as "Drive to Montreal".

The four types of anaphora entities are:

To use anaphoras in annotations:

Step 1: Identify an entity as referable
  1. In the Entities area of the Develop tab, select the entity.
  2. In the Referenced as field, select the correct anaphora type for this entity.
    For example, for a contact, select REF_PERSON.
  3. Train your model.
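
Once an entity is referable, a dialog application can keep track of the most recent value for each anaphora type and substitute it when the anaphora is detected. The Python sketch below illustrates that idea under stated assumptions: `DialogContext` is a hypothetical class, and only the REF_PERSON type named above is used.

```python
from typing import Dict, Optional


class DialogContext:
    """Hypothetical store of the latest value seen for each anaphora type."""

    def __init__(self) -> None:
        self._last: Dict[str, str] = {}

    def remember(self, ref_type: str, value: str) -> None:
        self._last[ref_type] = value

    def resolve(self, ref_type: str) -> Optional[str]:
        return self._last.get(ref_type)


ctx = DialogContext()
# An earlier turn mentioned the contact "Bob", whose entity is referenced as REF_PERSON.
ctx.remember("REF_PERSON", "Bob")
# A later turn contains "him", tagged as REF_PERSON.
resolved = ctx.resolve("REF_PERSON")
```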

Tag modifiers

A tag modifier is an entity that modifies other entities by adding a logical operator: AND, OR, or NOT. You specify tag modifiers by annotating samples.

Your Mix.nlu model can use the AND and OR modifiers to connect multiple entities. It can use the NOT modifier to negate the meaning of an entity.

For example, "a cappuccino and a latte" would be annotated as [AND][CoffeeType]cappuccino[/] and a [CoffeeType]latte[/][/]. The AND modifier applies to the two CoffeeType annotations.

The literal "no cinnamon" would be annotated as no [NOT][SprinkleType]cinnamon[/][/]. The NOT modifier applies to the SprinkleType annotation.

Note how the literals "and" and "no" are not annotated as an entity or tag modifier. Instead, tag modifiers are the parents of the annotations that they connect or negate.
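
Because tag modifiers are parents of the entities they affect, an application can walk the resulting tree to find out which entities are connected or negated. The following Python sketch uses a hypothetical nested-dictionary form of the two annotations above; it is not the service's response format.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical tree form of "[AND][CoffeeType]cappuccino[/] and a [CoffeeType]latte[/][/]".
AND_NODE: Dict[str, Any] = {"modifier": "AND", "children": [
    {"entity": "CoffeeType", "literal": "cappuccino"},
    {"entity": "CoffeeType", "literal": "latte"},
]}
# Hypothetical tree form of "no [NOT][SprinkleType]cinnamon[/][/]".
NOT_NODE: Dict[str, Any] = {"modifier": "NOT", "children": [
    {"entity": "SprinkleType", "literal": "cinnamon"},
]}


def collect(node: Dict[str, Any], negated: bool = False) -> List[Tuple[str, str, bool]]:
    """Return (entity, literal, negated) for every entity under a node."""
    if "modifier" in node:
        # In this sketch, a NOT parent marks everything below it as negated.
        neg = negated or node["modifier"] == "NOT"
        found: List[Tuple[str, str, bool]] = []
        for child in node["children"]:
            found.extend(collect(child, neg))
        return found
    return [(node["entity"], node["literal"], negated)]
```

With these inputs, `collect(AND_NODE)` yields both coffee types as wanted items, and `collect(NOT_NODE)` flags cinnamon as negated.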

Regex-based

A regex-based entity defines a set of values using regular expressions. For example, product or order values are typically alphanumeric sequences with a regular format, such as gro-456 or ABC 967. Both of these examples, and many more codes with the same general pattern, can be described with the regex pattern:
[A-Za-z]{3}\s?-?\s?[0-9]{3}

Similarly, you might use regex-based entities to match account numbers, postal (zip) codes, confirmation codes, PINs, or driver's license numbers, and other pattern-based formats.
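
A pattern can be tried out locally before it is entered in Mix.nlu. The sketch below uses Python's `re` module, which interprets this particular pattern the same way as a JavaScript regular expression; `find_order_numbers` is a hypothetical helper for illustration.

```python
import re
from typing import List

# The order-number pattern from the text above.
ORDER_NUMBER = re.compile(r"[A-Za-z]{3}\s?-?\s?[0-9]{3}")


def find_order_numbers(utterance: str) -> List[str]:
    """Return every substring of the utterance that matches the pattern."""
    return ORDER_NUMBER.findall(utterance)


matches = find_order_numbers("I ordered gro-456 and ABC 967")
# matches == ["gro-456", "ABC 967"]
```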

Creating regex-based entities

To use a regular expression to validate the value of an entity (for example, an order number as shown below), enter the expression as valid JavaScript.

In this example, the user is creating a regex-based entity called OrderNumber, which will match order numbers in the form gro-456, COF-123, sla 889, and so on (three letters, followed by an optional hyphen and/or spaces, followed by three digits).

To save the pattern, click Download project and save regex-based entity.

Before the entity is created (or modified), Mix.nlu exports your existing NLU model to a ZIP file containing a TRSX file so that you have a backup. Creating (or modifying) a regex-based entity requires your NLU model to be re-tokenized, which may take some time and may affect your existing annotations. You receive a message when the entity is saved successfully.

Mix.nlu validates the search pattern as you enter it and alerts you if it is invalid. Invalid expressions (including empty values) are not saved.

Notes and cautions

Note the following points when creating regular expressions in regex-based entities:

Capture groups

Be careful when using parentheses in a regular expression, for example to quantify a sub-pattern with +, *, ?, or {m,n}. Enclosing a sub-pattern in parentheses creates a capture group. In general programming, matching a regex pattern with capture groups against a string returns both the full match and the individual capture groups, in order, packaged as an array.

With Mix.nlu specifically, however, an entity expects a single value. When you use a regex with capture groups, Mix.nlu will return the result from the first capture group only rather than the full pattern. This is to allow extra flexibility for developers; for example if you want to recognize a date pattern, but only need the month to fulfill the user's intent. If you need to use a parenthetical group, but want the full pattern match as the value returned for the entity, there are two options:
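
The effect of a capture group on the returned value can be reproduced with any general-purpose regex engine. The Python sketch below is an illustration only: `entity_value` is a hypothetical helper that mimics the "first capture group" rule described above, and it is not Mix.nlu code. It also shows two standard regex techniques that keep the full match as the returned value: a non-capturing group `(?:...)`, and an outer group that wraps the whole pattern.

```python
import re
from typing import Optional


def entity_value(pattern: str, text: str) -> Optional[str]:
    """Mimic the rule above: first capture group if any, else the full match."""
    m = re.search(pattern, text)
    if m is None:
        return None
    return m.group(1) if m.re.groups else m.group(0)


text = "order COF-123"

# Parentheses around the separator create a capture group.
with_group = entity_value(r"[A-Za-z]{3}(\s|-)?[0-9]{3}", text)       # "-"

# A non-capturing group leaves no capture groups, so the full match is used.
non_capturing = entity_value(r"[A-Za-z]{3}(?:\s|-)?[0-9]{3}", text)  # "COF-123"

# An outer group becomes group 1 and covers the full pattern.
outer_group = entity_value(r"([A-Za-z]{3}(\s|-)?[0-9]{3})", text)    # "COF-123"
```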

Anchors

Nuance does not recommend using a caret (^) to denote the beginning of a regular expression, or a dollar sign ($) to denote the end, as doing so will cause the NLU engine to expect the expression at the beginning, or end, of a sentence. Consider this phone number regex-based entity (any phone number of format 123-456-7890):
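
The effect of anchors can be checked with a short experiment. This Python sketch assumes a phone pattern of the form `\d{3}-\d{3}-\d{4}`, chosen here to fit the 123-456-7890 format; it is an illustrative pattern and a general regex demonstration, not Mix.nlu behavior.

```python
import re

# Assumed pattern for numbers such as 123-456-7890.
PHONE = r"\d{3}-\d{3}-\d{4}"

utterance = "my number is 123-456-7890 thanks"

# With ^ and $, the number must be the entire input, so nothing is found.
anchored = re.search("^" + PHONE + "$", utterance)

# Without anchors, the number is found wherever it occurs in the sentence.
unanchored = re.search(PHONE, utterance)
```

Here `anchored` is `None`, while `unanchored.group(0)` is `"123-456-7890"`.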

Annotating with regex-based entities

Annotating with regex-based entities means identifying the tokens to be captured by the regex-defined value. At runtime the model tries to match user words with the regular expression.

For example:

What's the status of order [OrderNumber]COF-123[/]

Predefined entities

Mix.nlu includes a set of predefined entities that can be useful as you develop your own NLU models. Predefined entities save you the trouble of defining entities that are generally useful in a number of different applications, such as monetary amounts, Boolean values, calendar items (dates, times, or both), cardinal and ordinal numbers, and so on.

A predefined entity is not limited to a flat list of values, but instead can contain a complete grammar that defines the various ways that values for that entity can be expressed. A grammar is a compact way of expressing a vast range of possible constructions.

For example, within the nuance_DURATION entity, there is a grammar that defines expressions such as "3.5 hours", "25 mins", "for 33 minutes and 19 seconds", and so on. It would simply not make sense to try to capture the possible expressions for this entity in a list.

Some notes:

For more information, including on specific predefined entities, see Predefined entities.

Dialog predefined entities

Mix.nlu adds a default set of entities to simplify your Mix.dialog applications. These dialog entities are isA entities that refer to predefined entities. Dialog entities have shorter, more descriptive names than predefined entities. This can make it easier to develop and maintain your Mix.dialog application while taking advantage of the convenience of predefined entities.

For example, DATE is a dialog predefined entity that is defined as an isA entity for nuance_CALENDARX. If your Mix.dialog application processes dates, use the DATE entity instead of nuance_CALENDARX.

As with the predefined entities prefixed with nuance_, you cannot rename, delete, or edit dialog predefined entities.

Dialog entities appear in the Predefined Entities section of the Entities area. Mix adds them when you create your project.

This table briefly describes the purpose of each dialog predefined entity.

| Dialog entity | isA predefined entity | Description |
|---|---|---|
| DATE | nuance_CALENDARX | Calendar date |
| TIME | nuance_CALENDARX | Time of day |
| YES_NO | nuance_BOOLEAN | Yes or no |

Note: The following dialog entities are deprecated and, therefore, may appear in the Custom Entities list. These dialog entities can be edited, renamed, and deleted.

| Dialog entity | isA predefined entity | Description |
|---|---|---|
| CC_EXP_DATE | nuance_EXPIRY_DATE | Credit card expiry date |
| CREDIT_CARD | nuance_CARDINAL_NUMBER | Credit card number |
| CURRENCY | nuance_AMOUNT | Monetary amount |
| DIGITS | nuance_CARDINAL_NUMBER | String of digits |
| NATURAL_NUMBER | nuance_CARDINAL_NUMBER | Round number with no decimal point |
| PHONE | nuance_CARDINAL_NUMBER | Telephone number |
| SSN | nuance_CARDINAL_NUMBER | Social Security Number |
| ZIP_CODE | nuance_CARDINAL_NUMBER | Postal zip code |

Language support

The Nuance Mix Platform offers a growing number of languages. To determine the languages (locales) available to your project, go to Mix.dashboard, select your project, and click the Targets tab. For more information, see Build resources.

For the complete list of supported languages, see Languages.

Change log

2020-09-03

Updated Verify samples to describe bulk operations that change the verification state of multiple samples at the same time.

2020-09-02

Added the new Discover tab. The Mix.nlu Discover tab allows you to see what users are saying to your deployed application, giving you the opportunity to refine your NLU models based on actual data. For now the data is read-only; additional functionality will be added in future releases, such as the ability to export data, assign intents, annotate the data, and add selected samples to your training set.

2020-08-30

Updated and refactored the Modify samples and Verify samples sections to reflect updates to the UI of the Develop tab samples view and changes in functionality.

2020-08-11

2020-07-17

Added additional information to Verify samples to explain the impact of the new "intent verified" and "fully verified" states.

Note that action is required to approve (fully verify) entity annotations. This crucial step ensures that models are built with the correct data.

2020-07-14

2020-06-11

2020-05-04

Updated screenshots.

2020-03-31

2020-02-19

2020-01-22

Updated predefined entities section.

2019-12-18

2019-12-02

Replaced occurrences of the term "concept" with "entity."

2019-11-15

Below are changes made to the Mix.nlu documentation since the initial Beta release: