Search
Generator can search across your content library. Lexical and semantic text searches help users find relevant learning content within a library, or specific locations within a course that focus on a given search term. Field searches find specific words or phrases in the generated metadata for the content in your content library.
Lexical Search
Lexical search relies on matching lexemes, which can be thought of as the “root” of a particular word. This is similar to an “exact match” search, although there is no guarantee that the lexemes for a given search term are, in fact, an exact match. For example, the words jumps, jumping, and jumped all share the common lexeme jump, and all three words would be considered matches when performing a lexical search with the term jumping. This approach is relatively simple and fast, but it has limitations. For example, it may not be able to handle misspellings, and it does not take into account the context or meaning of the words, which can lead to irrelevant results.
When performing a lexical search with a multi-word search term, results can vary depending on whether or not the order of the lexemes is considered. Lexical search supports two modes, keyword and phrase:
- Keyword: When searching for multiple words that are not surrounded by quotation marks, Generator will search for locations where all lexemes appear, without consideration of the order of the lexemes. Searching for carrot cake will return matched text locations that have both the carrot and cake lexemes.
- Phrase: When searching for multiple words that are surrounded by quotation marks, Generator will perform a phrase search. Here, Generator will search for locations where all lexemes appear while taking the lexeme order into consideration. Searching for "carrot cake" will consider a text location a match if the text has the carrot lexeme followed immediately by a cake lexeme.
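The difference between the two modes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `lexeme` helper is a toy suffix stripper standing in for whatever lexeme extraction Generator actually uses, and all function names here are hypothetical.

```python
import re

def lexeme(word: str) -> str:
    """Toy stand-in for real lexeme extraction: strips a few common suffixes."""
    word = word.lower()
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def lexemes(text: str) -> list:
    """Split text into words and reduce each one to its (toy) lexeme."""
    return [lexeme(w) for w in re.findall(r"[a-z']+", text.lower())]

def keyword_match(term: str, text: str) -> bool:
    """Keyword mode: every lexeme of the term appears in the text, in any order."""
    return set(lexemes(term)) <= set(lexemes(text))

def phrase_match(term: str, text: str) -> bool:
    """Phrase mode: the term's lexemes appear consecutively and in order."""
    needle, haystack = lexemes(term), lexemes(text)
    n = len(needle)
    return any(haystack[i:i + n] == needle for i in range(len(haystack) - n + 1))

def lexical_match(term: str, text: str) -> bool:
    """Choose the mode from the surrounding quotation marks, as described above."""
    term = term.strip()
    if len(term) >= 2 and term[0] == term[-1] == '"':
        return phrase_match(term[1:-1], text)
    return keyword_match(term, text)

text = "A cake made with carrots"
print(lexical_match("carrot cake", text))    # keyword mode: order is ignored
print(lexical_match('"carrot cake"', text))  # phrase mode: order matters
```

With this toy lexeme function, the keyword search matches the sample text while the phrase search does not, because cake appears before carrots rather than immediately after.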
Special Considerations for Stop Words
Some words have special handling when included in a lexical search. Stop words are common “filler” words, such as ‘as’, ‘in’, and ‘on’. These words are considered unimportant to the overall search because they don’t carry significant meaning. Generator treats stop words specially before querying the content text, and the handling depends on the type of lexical search being performed:
- When a keyword search is performed with the term carrot in cake, the ‘in’ lexeme is ignored. The keyword search results for carrot cake and carrot in cake are the same.
- When a phrase search is performed, the stop word is treated as interchangeable with other stop words. Performing a search with the term "carrot in cake" asks Generator to find all text with the lexeme carrot followed by any stop word followed by the lexeme cake. For example, results from searching the phrase "carrot in cake" will include both "carrots in cake" and "carrot on cake".
Match Property
Generator can include the list of matched terms with the search results when the return_match query parameter is set to True. For keyword lexical searches, Generator includes all matched terms in the result’s match list. For phrase lexical searches, Generator builds a “match regex string” for the match value. This regex is flexible enough to account for the interchangeability of stop words in the search results, as well as the handling of contractions, which are sometimes split apart to perform the text search.
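The exact format of the match regex string is not specified here, so the following Python sketch only illustrates the idea: how a pattern could treat stop words as interchangeable in a phrase search, and how a keyword search could drop them. The `STOP_WORDS` tuple is an illustrative subset, the trailing `\w*` is a crude stand-in for lexeme matching, and the function names are hypothetical.

```python
import re

# Illustrative subset only; the full stop word list is an assumption.
STOP_WORDS = ("as", "in", "on")

def keyword_terms(term: str) -> list:
    """Keyword mode: stop words are dropped from the search term."""
    return [w for w in term.lower().split() if w not in STOP_WORDS]

def phrase_to_regex(phrase: str) -> str:
    """Phrase mode: each stop word becomes a slot that accepts any stop word."""
    any_stop_word = "(?:" + "|".join(STOP_WORDS) + ")"
    parts = []
    for word in phrase.lower().split():
        if word in STOP_WORDS:
            parts.append(any_stop_word)
        else:
            # \w* loosely tolerates suffixes such as the plural "s".
            parts.append(re.escape(word) + r"\w*")
    return r"\b" + r"\s+".join(parts) + r"\b"

pattern = phrase_to_regex("carrot in cake")
for candidate in ("carrots in cake", "carrot on cake", "carrot cake"):
    print(candidate, bool(re.search(pattern, candidate, re.IGNORECASE)))
```

The first two candidates match the pattern and the third does not, because the phrase pattern still requires some stop word between the two lexemes.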
Semantic Search
Semantic search, in contrast, considers the underlying meaning of the search term or phrase rather than matching lexemes. The results of a semantic search are based on their relevance to the search term.
Search Results
Once the search has been performed, Generator will return a list of search results. Each entry in the response list will include the following information:
- tenant_id: the tenant identifier
- content_id: the learning content identifier
- version: the learning content’s version
In certain situations, other properties will be included in the search results:
- location: a string representation of the search-term match location within the course. The value of this property is reflected differently according to the specific type or author of learning content. For example, for an mp3 course, the ‘location’ value would reflect a timestamp, whereas for packaged content, the ‘location’ value could reflect the name or navigation to a specific chapter or slide. This property is only included when performing lexical or semantic searches within a specific course. The ‘location’ property is not included in tenant-wide searches or field searches.
- match_strength: a float value between 0 and 1 that represents the match strength, where a value of 1 represents a perfect match. This property is only included when performing a semantic search.
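Because location and match_strength are only present in some responses, client code should treat them as optional. A minimal Python sketch, assuming each result entry has already been decoded from JSON into a dictionary; the `SearchResult` class and `parse_result` function are hypothetical names, not part of Generator:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SearchResult:
    tenant_id: str
    content_id: str
    version: int
    # Only present for lexical or semantic searches within a specific course.
    location: Optional[str] = None
    # Only present for semantic searches.
    match_strength: Optional[float] = None

def parse_result(entry: dict) -> SearchResult:
    """Build a SearchResult, leaving absent optional properties as None."""
    return SearchResult(
        tenant_id=entry["tenant_id"],
        content_id=entry["content_id"],
        version=entry["version"],
        location=entry.get("location"),
        match_strength=entry.get("match_strength"),
    )

result = parse_result({"tenant_id": "tenant1", "content_id": "course1", "version": 1})
print(result.location, result.match_strength)
```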
Field Search
Both lexical and semantic search work by finding search_term references in the parsed text for a course. However, users may want to search for specific terms across the generated metadata for their content library. Field search performs searches for terms in the generated metadata field values of your content library. This can be used, for example, to find all content that has a title containing a specific word or phrase.
Field search is an exact match search, so alternative verb tenses, spelling mistakes, and the underlying meaning of the search term are not considered. However, field search is case-insensitive.
Field search results will include information about the content and the field that is considered a match. For example, searching across the content in your tenant for the term carrot may return a response similar to this:
[
  {
    "content_id": "course1",
    "version": 1,
    "tenant_id": "default",
    "field": {
      "create_dt": "2025-01-28T20:00:00.000000",
      "update_dt": "2025-01-29T21:00:00.000000",
      "field_type": "SUMMARY",
      "value": "This course explores the user's deep appreciation for carrot cake. Learners will gain insight into the user's personal food preferences and the joy they find in sharing their favorite dessert with others."
    }
  },
  {
    "content_id": "course2",
    "version": 1,
    "tenant_id": "default",
    "field": {
      "create_dt": "2025-01-30T20:00:00.000000",
      "update_dt": "2025-01-31T21:00:00.000000",
      "field_type": "TITLE",
      "value": "A Carrot Cake Craving"
    }
  }
]
Note that field search results do not include the location property, as field searches are not performed against a course’s parsed text.
Search Scopes
When performing a search, it may be helpful to limit the focus of the search operation. Generator can limit the scope of a search in two different ways:
- within a specific course, or
- across a tenant
Each scope is supported through the same search endpoint. The scope of the search is determined by the presence of the content_id query parameter. If the request includes the content_id parameter, the search is performed against that specific course. If no content_id parameter is included, the search is performed across all courses imported into the specified tenant.
The user must include the tenant_id parameter when using the search endpoint. Currently, users are not able to search across multiple tenants in a single request. A search that spans several tenants must be issued as separate requests, one per tenant. Users can then aggregate the tenant-specific results and, for semantic searches, order them by each result’s match_strength.
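Aggregating per-tenant results can be done on the client side. The Python sketch below assumes each tenant's semantic search response has already been fetched and decoded into a list of dictionaries; `merge_tenant_results` is a hypothetical helper, and it expects every entry to carry match_strength, so it applies to semantic results only.

```python
from itertools import chain

def merge_tenant_results(*result_lists):
    """Combine per-tenant semantic results, strongest match first."""
    merged = chain.from_iterable(result_lists)
    return sorted(merged, key=lambda r: r["match_strength"], reverse=True)

# Made-up sample data for illustration only.
tenant1_results = [
    {"content_id": "example_content_2", "version": 1, "tenant_id": "tenant1", "match_strength": 0.52},
    {"content_id": "example_content_1", "version": 4, "tenant_id": "tenant1", "match_strength": 0.44},
]
tenant2_results = [
    {"content_id": "other_content", "version": 1, "tenant_id": "tenant2", "match_strength": 0.61},
]

for entry in merge_tenant_results(tenant1_results, tenant2_results):
    print(entry["tenant_id"], entry["content_id"], entry["match_strength"])
```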
The Search API Endpoint
Searches are performed through the GET /api/v1/content/search endpoint.
The following information is required with each search request:
- search_type: query parameter that specifies the search behavior. Value must be ‘SEMANTIC’ or ‘LEXICAL’
- term: query parameter that specifies the term to be searched
- tenant_id: header that specifies the identifier of the tenant to which the search function should be limited.
The following information is optional, but can be used to filter or focus the search query:
- content_id: query parameter that specifies the content identifier to which the search function should be limited. The content_id and version properties should both be defined or both be unset. Default value is ‘’.
- version: query parameter that specifies the version of the content identified by the content_id parameter to which the search function should be limited. The content_id and version properties should both be defined or both be unset. Default value is ‘’.
- tag_id: query parameter that filters which content is visible to the search function. Only content that has been tagged with a tag included in the tag_id parameter can be considered when performing the search. Default value is ‘[]’.
- skip: query parameter that indicates the number of top search results to skip. Useful when paging search results. Default value is 0.
- include_inactive: query parameter that determines whether or not to include ‘inactive’ content in the search results. Default value is False.
- return_match: query parameter that determines whether or not Generator should return information about the matched search term with the search results. Only available for lexical searches. Default value is False.
- field: query parameter that filters the metadata field property values that can be used in the search. Only available for field searches. Default value is ‘’ (no field filters are applied).
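The required and optional parameters above can be assembled into a request with Python's standard library. This is a sketch under stated assumptions: the base URL is a placeholder, `build_search_request` is a hypothetical helper, and sending tenant_id as a header named exactly `tenant_id` follows the list above but has not been verified against a live instance.

```python
from urllib.parse import urlencode

SEARCH_PATH = "/api/v1/content/search"

def build_search_request(base_url, tenant_id, search_type, term,
                         content_id=None, version=None, skip=0):
    """Return the (url, headers) pair for a search request."""
    params = {"search_type": search_type, "term": term, "skip": skip}
    # Optional scope parameters are only sent when the caller provides them.
    if content_id is not None:
        params["content_id"] = content_id
    if version is not None:
        params["version"] = version
    url = f"{base_url.rstrip('/')}{SEARCH_PATH}?{urlencode(params)}"
    headers = {"tenant_id": tenant_id}  # header name is an assumption
    return url, headers

# "https://generator.example.com" is a placeholder host, not a real endpoint.
url, headers = build_search_request(
    "https://generator.example.com", "tenant1", "SEMANTIC", "mode"
)
print(url)
print(headers)
```

The resulting pair can be passed to any HTTP client to issue the GET request.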
For example, consider the following API request performed against a tenant tenant1:
GET /api/v1/content/search?search_type=SEMANTIC&term=mode&skip=0
The application would perform a search against all courses under tenant tenant1. The search would be a semantic search, would search on the term mode, and would not skip any results.
An example response:
[
  {
    "content_id": "example_content_2",
    "version": 1,
    "tenant_id": "tenant1",
    "match_strength": 0.5288990669056727
  },
  {
    "content_id": "example_content_2",
    "version": 3,
    "tenant_id": "tenant1",
    "match_strength": 0.4942204812129025
  },
  {
    "content_id": "example_content_2",
    "version": 2,
    "tenant_id": "tenant1",
    "match_strength": 0.4798217093004602
  },
  {
    "content_id": "example_content_1",
    "version": 4,
    "tenant_id": "tenant1",
    "match_strength": 0.44236816525549805
  }
]
If you wanted to dig deeper into a specific course, you could perform a search against that specific course. For example:
GET /api/v1/content/search?search_type=SEMANTIC&term=mode&content_id=example_content_2&skip=0
An example response:
[
  {
    "location": "Mode/Determining the Mode",
    "content_id": "example_content_2",
    "version": 2,
    "tenant_id": "tenant1",
    "match_strength": 0.5288990669056727
  },
  {
    "location": "Mode/What is Mode?",
    "content_id": "example_content_2",
    "version": 2,
    "tenant_id": "tenant1",
    "match_strength": 0.4942204812129025
  }
]