# NLP best practices

<details>

<summary>⭐️ Best practices to create good intents</summary>

Some tips to create the best intents:

* **Use intents that you already have**

Have customer data like FAQs? Analyze it to see which questions are most common and important, then program your bot to handle those. Calculate each question's volume to prioritize them. Unsure what users need? Start with a simple click-bot to gather data, then use that info to define your bot's intents.

* **Start small**

If your bot can handle 5 out of 15 questions but those 5 cover 80% of queries, that's a great start. It frees up your team to address the remaining 20%. Add new intents based on user data and let your bot grow naturally. It's better to start small and excel than to do too much and fail.

* **Balance the number of expressions per intent**

It’s important to have about the same number of expressions per intent, so that the bot doesn't train more on intents with a large expression count and neglect the ones with fewer expressions.

* **Revise and optimize**

Start with a general intent to trigger a basic flow, then add follow-up questions to understand user needs better, allowing you to refine intents later.

For example, in a telco support bot:

1. Issue with phone
2. Issue with wifi

Each intent can cover multiple issues (battery, screen, software, lost order for phones; connection types for wifi). Use follow-up questions to specify the problem (e.g., phone model, modem type). Over time, analyze user messages: if users often specify phone models but not wifi issues, create more intents or use entities for phones while keeping wifi intents broad. Creating intents is an ongoing, iterative process.

* **Avoid conflict**

When intents are very similar, merge them to avoid confusion. For example, if you have intents for booking train and bus tickets, merge them into one 'booking tickets' intent, and differentiate by the transportation mode entity.
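
The volume-based prioritization described above can be sketched in a few lines of Python. This is a minimal illustration, not a Chatlayer feature: the function name `prioritize_questions`, the sample log, and the assumption that each logged question has already been mapped to a topic label are all hypothetical.

```python
from collections import Counter


def prioritize_questions(questions, target_coverage=0.8):
    """Pick the most frequent questions until target_coverage of all traffic is reached."""
    counts = Counter(questions)
    total = sum(counts.values())
    selected, covered = [], 0
    for question, count in counts.most_common():
        # Stop as soon as the questions picked so far cover enough traffic.
        if total and covered / total >= target_coverage:
            break
        selected.append((question, count))
        covered += count
    return selected, (covered / total if total else 0.0)


# Hypothetical log: one topic label per incoming support question.
log = (
    ["opening hours"] * 50
    + ["reset password"] * 30
    + ["change address"] * 15
    + ["cancel order"] * 5
)
top, coverage = prioritize_questions(log)
print(top, coverage)  # [('opening hours', 50), ('reset password', 30)] 0.8
```

Here two of the four topics already account for 80% of the traffic, so they would be reasonable first intents.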

</details>

<details>

<summary>⭐️ Best practices to create good expressions</summary>

Creating a good set of expressions is key to creating a smart bot. The accuracy of your bot stands or falls with the quality of your expressions, so spend enough time on them and review them regularly.

Here are some tips & tricks for creating good expressions:

* **Use diverse expressions in terms of vocabulary and structure**

For more information, read our [dedicated article](https://docs.chatlayer.ai/nlp/natural-language-processing-nlp/how-to-nlp/creating-diverse-expressions).

* **Use real live data**

Chances are there are already a lot of user expressions which you can feed to your bot. Think customer support logs, social media posts, comments on your company's forum etc.

* **Use pre-built intents**

No need to reinvent the wheel when you can download the wheel directly on the Chatlayer platform! We have a lot of [pre-built intents](https://docs.chatlayer.ai/navigation/natural-language-processing-nlp/intents#from-prebuilt-intents) ready for you to use. Simply download them, train the NLP, and you're good to go!

* **Be specific**

Expressions must match a specific intent. For **change\_address**, phrases like *I have a question* are too vague. For **forgot\_password**, *I forgot it* is insufficiently specific. Be clear and precise.

* **Avoid filler words**

Avoid expressions such as *Hello, I want to book a train ticket. Can you help me with that? Thanks*, because the sentence contains too many irrelevant words. Use *I want to book a train ticket* instead, which is shorter and more relevant.

* **Use real language**

Add words and sentences that a real person would use in this conversation. Don’t use entire paragraphs or overly formal language. Keep it light and natural instead. Use real user messages if you have them; data is knowledge.

* **Allow for slang and dialect**

Feel free to use slang words, common abbreviations (e.g. *asap* instead of *as soon as possible*) and regional dialects. Don’t overdo it though: stick to things the majority of people would actually use.

* **Create enough expressions**

For good baseline performance, give each intent at least 40 to 50 expressions. For excellent behavior, aim for 200 to 400 expressions per intent. Regularly review your user data and incorporate user-provided expressions to continually improve your model’s accuracy.

* **Keep the number of expressions balanced**

Ensure a balanced number of expressions per intent. If one intent has 100 expressions and another only 10, the model will more often match user messages to the intent with 100 expressions, causing overtriggering. Inaccurate matches happen because the model learns better from the intent with more data.

* **Use correct spelling**

Ensure each word in the training data is correctly spelled. The engine maps words to numeric representations, but only for a predefined 200,000-word vocabulary. Misspelled words can lead to incorrect interpretations, like *pone* being corrected to *pony* or *phone*. Verify spelling to ensure your bot accurately learns relevant meanings.

* **Lower case vs UPPER CASE**

Users often do not use capitalisation when chatting with a bot. However, for intent classification, capitalisation is ignored, so you do not have to worry about it. But be careful: capitalisation is relevant for entity extraction.

* **No need for punctuation (or accents)**

Punctuation and accents are ignored by our NLP, so don't worry about adding them. For instance, *élève* is treated the same as *eleve*.
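
Several of the tips above (enough expressions, balanced counts, and ignored case, punctuation and accents) can be checked on an exported training set before uploading it. The sketch below is only an approximation under stated assumptions: `normalize` imitates the behaviour described here and is not Chatlayer's actual preprocessing, the `{intent: [expressions]}` layout is hypothetical, and the `max_ratio` threshold is an arbitrary choice. Only the default `min_count` of 40 comes from the guideline above.

```python
import string
import unicodedata
from collections import defaultdict


def normalize(text):
    """Lowercase, strip accents and ASCII punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    no_punct = no_accents.translate(str.maketrans("", "", string.punctuation))
    return " ".join(no_punct.split())


def audit(intents, min_count=40, max_ratio=2.0):
    """Return warnings for a {intent_name: [expressions]} mapping."""
    warnings = []
    sizes = {}
    for name, expressions in intents.items():
        # Group expressions that become identical after normalization.
        seen = defaultdict(list)
        for expression in expressions:
            seen[normalize(expression)].append(expression)
        for variants in seen.values():
            if len(variants) > 1:
                warnings.append(f"{name}: near-duplicates {variants}")
        sizes[name] = len(seen)
        if sizes[name] < min_count:
            warnings.append(
                f"{name}: only {sizes[name]} distinct expressions (< {min_count})"
            )
    # Flag a large gap between the biggest and the smallest intent.
    if sizes and max(sizes.values()) > max_ratio * max(min(sizes.values()), 1):
        warnings.append(
            f"unbalanced: largest intent has more than {max_ratio}x "
            "the expressions of the smallest"
        )
    return warnings


# Hypothetical training data.
sample = {
    "book_ticket": ["I want to book a train ticket", "i want to book a train ticket!"],
    "greet": ["hello"],
}
for warning in audit(sample, min_count=1):
    print(warning)
```

Expressions that collapse to the same normalized form add no new information for intent classification, so they are counted once when checking size and balance.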

</details>

{% content-ref url="how-to-nlp/creating-diverse-expressions" %}
[creating-diverse-expressions](https://docs.chatlayer.ai/nlp/natural-language-processing-nlp/how-to-nlp/creating-diverse-expressions)
{% endcontent-ref %}

<details>

<summary>⭐️ Best practices to create good entities</summary>

[Entities](https://docs.chatlayer.ai/navigation/natural-language-processing-nlp/synonym-entities) should only be used if their value is needed in the bot flow.

When adding entities to your training data, take the following things into account:

* **Punctuation**

Do not include any punctuation like '.' or '?' in your entity. '-' is ok, as it is often part of the entity, as in *Sint-Niklaas*.

* **Capitalisation**

The entity extraction models are not case sensitive, so there is no need to add both *Brussels* and *brussels*.

* **Words, not sentences**

An entity is a single word or a small number of words, usually a noun phrase. Never mark full sentences or longer phrases as an entity. If users often paraphrase instead of using the word itself, which frequently happens with more technical terms (for example, *the little box that I use in order to have internet everywhere in my house* instead of *wifi extender*), consider using a separate intent instead of an entity.

* **Display entities in expressions**

We recommend adding at least 30 expressions per entity to guarantee the quality of entity detection.
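
The punctuation, capitalisation and length guidelines above lend themselves to a quick automated check on a list of entity values. The following is a rough sketch with hypothetical names (`check_entity_values`, `max_words`). The four-word limit is an assumption standing in for "a small number of words", not a documented Chatlayer rule.

```python
import re

# Letters, digits, underscores, whitespace and hyphens are accepted;
# anything else (such as '.', '?' or ',') is reported.
ALLOWED = re.compile(r"[\w\s-]+")


def check_entity_values(values, max_words=4):
    """Return (value, problem) pairs for values that break the guidelines."""
    problems = []
    seen = set()
    for value in values:
        if not ALLOWED.fullmatch(value):
            problems.append((value, "contains punctuation other than '-'"))
        if len(value.split()) > max_words:
            problems.append((value, f"longer than {max_words} words"))
        # Case is ignored, so 'Brussels' and 'brussels' count as one value.
        key = value.casefold()
        if key in seen:
            problems.append((value, "duplicate when ignoring case"))
        seen.add(key)
    return problems


values = ["Sint-Niklaas", "Brussels", "brussels", "wifi extender?"]
for value, problem in check_entity_values(values):
    print(f"{value!r}: {problem}")
```

With these sample values, *Sint-Niklaas* and *Brussels* pass, while *brussels* is reported as a case-insensitive duplicate and *wifi extender?* is reported for its question mark.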

</details>


