Categorization rules
As part of transcribing recordings, Conversation Analyzer categorizes the textual contents of the transcript by identifying key phrases based on the defined rules and recording the category or subcategory those rules belong to. A category is a collection of subcategories, each of which contains a series of rules. Each rule consists of a word or phrase and the party who said that word or phrase. If the transcript contains the word or phrase, and the specified party spoke it, Conversation Analyzer matches it against the category.
For example, you may want to track how polite your agents are when speaking with customers. Create a category called 'Politeness' that contains subcategories with rules that look for phrases such as 'Please', 'Thank you' and 'You're welcome'. You may also want to ensure that agents are promoting a new product or service. To do this, create a category for the product or service with subcategories that identify instances of the agent using terms relating to the product or service. For information on how to create a categorization rule, see Managing categorization rules.
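The category, subcategory and rule hierarchy described above can be pictured with a small data model. The following is a minimal sketch only: the class names, field names and the `matched_subcategories` helper are hypothetical and are not part of Conversation Analyzer, and the lowercase substring test is a simplifying assumption standing in for the real matching engine.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# All names below are hypothetical, chosen for illustration only.

@dataclass
class Rule:
    phrase: str  # word or phrase to look for
    party: str   # who must have said it, for example "agent" or "customer"

@dataclass
class Subcategory:
    name: str
    rules: List[Rule] = field(default_factory=list)

@dataclass
class Category:
    name: str
    subcategories: List[Subcategory] = field(default_factory=list)

def matched_subcategories(category: Category,
                          utterances: List[Tuple[str, str]]) -> List[str]:
    """Return the names of subcategories that have at least one satisfied rule.

    utterances is a list of (party, text) pairs. A rule is satisfied when the
    specified party said text containing the phrase. Case-insensitive
    substring matching is an assumption made for this sketch.
    """
    hits = []
    for sub in category.subcategories:
        for rule in sub.rules:
            if any(party == rule.party and rule.phrase.lower() in text.lower()
                   for party, text in utterances):
                hits.append(sub.name)
                break  # one satisfied rule is enough for this subcategory
    return hits

politeness = Category("Politeness", [
    Subcategory("Courtesy", [Rule("please", "agent"), Rule("thank you", "agent")]),
    Subcategory("Acknowledgement", [Rule("you're welcome", "agent")]),
])

transcript = [
    ("customer", "Thank you for your help"),
    ("agent", "You're welcome"),
]

result = matched_subcategories(politeness, transcript)
```

In this sketch the customer's "Thank you" does not satisfy the agent rule, because the rule names the agent as the speaking party.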
Categorization expression language
The categorization expression language describes the required format of the values you provide in the Expression and Find fields in Category Editor when creating categorization and substitution rules. Conversation Analyzer can then use these values to locate matching text in the transcripts. For more information, see Managing categorization rules and Managing substitution rules.
Expression and Find value validation
Valid Expression and Find field values contain only alphanumeric, apostrophe and space characters; that is, values can contain spaces (U+0020), apostrophes (U+0027), and characters from the following Unicode categories:
Unicode Category Name | Description |
---|---|
Ll | Letter, Lowercase. For example, a-z, ᵯ, ḅ, ṥ, ở, ﬓ |
Lu | Letter, Uppercase. For example, A-Z, Ý, Ŧ, Ǣ, Щ, 𝕐 |
Lt | Letter, Titlecase. For example, Dž, ᾎ, ᾟ, ᾭ |
Lo | Letter, Other. For example, ª, ܗ, 爨. The Mongolian letter "Manchu Ali Gali Lha" (U+18AA) is not allowed within Expression and Find values. This character is used internally within the categorization engine. If the character appears within spoken text, Conversation Analyzer treats the character as an apostrophe. |
Lm | Letter, Modifier. For example, ʰ, ᵓ, 〲, ꟹ |
Mn | Mark, Nonspacing. For example, ុ, ᜴ |
Nd | Number, Decimal Digit. For example, 0-9, ۳, ૮, ๗ |
Pc | Punctuation, Connector. For example, _, ‿, ⁀, ⁔, ︳, ︴, ﹍, ﹎, ﹏, ＿. This category includes ten characters; the most commonly used is the LOW LINE character (_), U+005F. |
Values can be no more than 100 characters long.
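The validation rules above could be checked before submitting a value, for example with Python's standard `unicodedata` module. This is a rough sketch, not the product's validator: the function name `is_valid_value` is hypothetical, the length is assumed to be counted in Unicode code points, and the wildcard characters (?, * and #) are assumed to be accepted in addition to the categories in the table. Category lookups depend on the Unicode version bundled with your Python installation.

```python
import unicodedata

ALLOWED_CATEGORIES = {"Ll", "Lu", "Lt", "Lo", "Lm", "Mn", "Nd", "Pc"}
ALLOWED_LITERALS = {" ", "'"}   # space (U+0020) and apostrophe (U+0027)
WILDCARDS = {"?", "*", "#"}     # assumption: wildcards are accepted in values
RESERVED = "\u18aa"             # Mongolian letter used internally by the engine
MAX_LENGTH = 100                # assumption: counted in code points

def is_valid_value(value: str, allow_wildcards: bool = True) -> bool:
    """Return True if value passes the checks described in this section.

    Empty values are not covered by the text; this sketch accepts them.
    """
    if len(value) > MAX_LENGTH:
        return False
    for ch in value:
        if ch == RESERVED:
            return False
        if ch in ALLOWED_LITERALS:
            continue
        if allow_wildcards and ch in WILDCARDS:
            continue
        if unicodedata.category(ch) not in ALLOWED_CATEGORIES:
            return False
    return True
```

For example, `is_valid_value("you're welcome")` passes, while `is_valid_value("hello!")` fails because the exclamation mark belongs to a punctuation category that is not in the table.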
Replace by value validation
Values can be no more than 64 characters long.
Wildcards in values
The categorization expression language supports the following wildcards within the values. Examples refer to the Expression field you fill in when creating categorization rules, but exactly the same rules apply to the Search phrase field in substitution rules.
Wildcard | Description | Example expressions | Details |
---|---|---|---|
? | Wildcard representing one character. Each ? represents one character. | wh? | The following words will match the example expression: "who" and "why". For an example of an expression using the ? wildcard, see Example 2. Expression using the ? character wildcard. |
? | Wildcard representing one character. Each ? represents one character. | wh?? | The following words will match the example expression: "what", "when", "whom". For an example of an expression using the ?? wildcard, see Example 5. Expression using the ?? wildcard. |
* | Wildcard representing zero to many characters | sit* | The following words will match the example expression: "sit", "sits", "sitting". For an example of an expression using the * wildcard, see Example 3. Expression using the * character wildcard. You can also use * to represent a word or words. For information, see Wildcard representing zero to many words. |
# | Wildcard representing one numeric character | ### | Only digits will match the example expression, not text. Text containing "123" will match the example expression but text containing "one two three" will not. For an example of an expression using the # wildcard, see Example 4. Expression using the # character wildcard. |
* | Wildcard representing zero to many words | cat * mat | The following phrases will match the example expression: "cat mat", "cat sits on the mat", and "cat always sits happily on the mat". For an example of an expression using the * word wildcard, see Example 6. Expression using the * word wildcard. You can also use * to represent a character or characters. For information, see Wildcard representing zero to many characters. |
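One way to reason about the wildcards is to translate an expression into a regular expression. The sketch below is an approximation under stated assumptions, not the engine's actual implementation: the function names are hypothetical, a "character" is taken to be a regex word character (`\w`), words are assumed to be separated by one or more non-word characters, and matching is assumed to be case-sensitive.

```python
import re

def token_to_pattern(token: str) -> str:
    """Translate one space-delimited token, applying character wildcards."""
    out = []
    for ch in token:
        if ch == "?":
            out.append(r"\w")    # exactly one character
        elif ch == "*":
            out.append(r"\w*")   # zero to many characters
        elif ch == "#":
            out.append(r"\d")    # exactly one digit
        else:
            out.append(re.escape(ch))
    return "".join(out)

def compile_expression(expression: str) -> "re.Pattern[str]":
    """Build a regex for an expression (hypothetical helper)."""
    pieces = []
    for token in expression.split():
        if token == "*":
            # A standalone * stands for zero to many whole words.
            # A leading * constrains nothing, so it is skipped.
            if pieces:
                pieces.append(r"(?:\W+\w+)*")
        else:
            separator = r"\W+" if pieces else ""
            pieces.append(separator + token_to_pattern(token))
    if not pieces:
        raise ValueError("expression needs at least one word")
    return re.compile(r"\b" + "".join(pieces) + r"\b")

def matches(expression: str, text: str) -> bool:
    """Return True if text contains a match for expression."""
    return compile_expression(expression).search(text) is not None
```

With these assumptions, `matches("wh?", "who")` and `matches("cat * mat", "cat sits on the mat")` are true, while `matches("###", "one two three")` is false, which agrees with the table above.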
Words between value
The Words between field is available when creating categorization rules or substitution rules. It sets the maximum number of words that can appear between the specified words in a phrase. If set to a value other than 0, the ~N expression appears at the end of the rule name in the profile tree.
If the expression contains more than two words, the Words between value applies separately to each gap between consecutive specified words.
See below for examples.
Expression examples
Example 1. Simple expression
Expression: the cat sat
With a simple expression, only the exact word or phrase will satisfy the rule.
Example 2. Expression using the ? character wildcard
Expression: the cat? sat
The ? in the expression represents a single character that must appear after "cat" but before "sat" in matching text.
Text | Does it match? | Explanation |
---|---|---|
the cat sat | No | The ? in the expression requires a character in its place. |
the cats sat | Yes | The ? in the expression represents the "s" in the text. |
their cats sat | No | The expression does not allow any additional characters after "the". |
Example 3. Expression using the * character wildcard
Expression: sit*
The * in the expression represents zero to many characters that can appear after "sit" in matching text.
Text | Does it match? | Explanation |
---|---|---|
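sit | Yes | The * in the expression allows zero to many characters in its place; here it represents zero characters. |
sits | Yes | The * in the expression represents the "s" in the text. |
sitting | Yes | The * in the expression represents the "ting" in the text. |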
sat | No | The expression requires that "sit" appears in the text. |
Example 4. Expression using the # character wildcard
Expression: ### ###
Matching text must contain two sets of three digits, separated by a non-word character and no other characters.
Text | Does it match? | Explanation |
---|---|---|
123 456 | Yes | The expression matches two sets of three digits. |
123-456 | Yes | The expression matches two sets of three digits. The hyphen is a non-word character and separates the two sets of three digits. |
123456 | No | The expression requires two sets of three digits, not one set of six. |
123 abc 456 | No | The expression requires two consecutive sets of three digits, not two sets separated by any other characters. |
Example 5. Expression using the ?? wildcard
Expression: wh?? cat
The ?? in the expression represents two characters that must appear after "wh" and before "cat" in matching text.
Text | Does it match? | Why |
---|---|---|
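what cat | Yes | The ?? in the expression represents the "at" in the text. |
when cat | Yes | The ?? in the expression represents the "en" in the text. |
who cat | No | The ?? in the expression requires two characters after "wh", not one. |
which cat | No | The ?? in the expression requires two characters after "wh", not three. |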
Example 6. Expression using the * word wildcard
Expression: the cat sits * on the mat
The text must contain the phrase "the cat sits on the mat" with zero to many words between "sits" and "on".
Text | Does it match? | Why |
---|---|---|
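the cat sits on the mat | Yes | The * in the expression allows zero to many words in its place; here it represents zero words. |
the cat sits happily on the mat | Yes | The * in the expression represents the word "happily" in the text. |
the cat always sits on the mat | No | The expression does not allow any additional words between "cat" and "sits". |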
Example 7. Expression using the Words between field
Expression: cat mat
Words between: 3
The text must contain the words "cat" and "mat" with up to three words between them.
Text | Does it match? | Why |
---|---|---|
the cat mat | Yes | The text contains no words between "cat" and "mat" and the expression allows up to three. |
the cat likes mat | Yes | The text contains one word between "cat" and "mat", and the expression allows up to three. |
the cat sits on the mat | Yes | The text contains three words between "cat" and "mat", and the expression allows up to three. |
the cat always sits happily on the mat | No | The text contains five words between "cat" and "mat", but the expression only allows up to three. |
Example 8. Expression using the Words between field
Expression: cat sat mat
Words between: 3
The text must contain the words "cat", "sat" and "mat" with up to three words between each of them. In this example, matching text may contain three words between "cat" and "sat" and also three words between "sat" and "mat".
Text | Does it match? | Why |
---|---|---|
the cat eagerly sat on the mat | Yes | The text contains one word between "cat" and "sat", and two words between "sat" and "mat"; the expression allows up to three. |
the cat eagerly and promptly sat on the green mat | Yes | The text contains three words between "cat" and "sat", and three words between "sat" and "mat"; the expression allows up to three. |
the cat sat on the green and blue mat | No | The text contains too many words (five) between "sat" and "mat". |
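The Words between behavior shown in Examples 7 and 8 can be approximated with a regular expression that allows a bounded number of extra words in each gap. This is a sketch under assumptions rather than the engine's implementation: the function name is hypothetical, the specified words are assumed to appear in the order given, a word is taken to be a run of regex word characters (`\w+`), and wildcards are not handled here.

```python
import re

def matches_with_words_between(expression: str, text: str,
                               words_between: int = 0) -> bool:
    """Return True if text contains the expression's words in order,
    with at most words_between extra words in each gap."""
    words = [re.escape(word) for word in expression.split()]
    # Each gap: up to N extra words, then the separator before the next word.
    gap = r"(?:\W+\w+){0,%d}\W+" % words_between
    pattern = r"\b" + gap.join(words) + r"\b"
    return re.search(pattern, text) is not None

# Rows from Example 7 (Expression: cat mat, Words between: 3)
example_7 = [
    matches_with_words_between("cat mat", "the cat sits on the mat", 3),
    matches_with_words_between("cat mat", "the cat always sits happily on the mat", 3),
]

# Rows from Example 8 (Expression: cat sat mat, Words between: 3)
example_8 = [
    matches_with_words_between("cat sat mat", "the cat eagerly sat on the mat", 3),
    matches_with_words_between("cat sat mat", "the cat sat on the green and blue mat", 3),
]
```

Because the limit is applied to each gap separately, a three-word expression can span more than three extra words in total, as the second row of Example 8's table shows.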