Semantic and Lexical Search
Generator can search the content library for specific words or phrases. It supports both lexical and semantic searches, either across a content library or within a specific course.
Keyword or lexical search relies on matching exact words or phrases. This is the common behavior for search functionality. This approach is relatively simple and fast, but it has limitations. For example, it may not be able to handle misspellings, and it does not take into account the context or meaning of the words, which can lead to irrelevant results.
Semantic search, by contrast, considers the underlying meaning of the search term or phrase when performing the search. The results of a semantic search are based on their relevance to the search term rather than on exact word matches.
Once the search has been performed, Generator will return a list of search results. Each entry in the response list will include the following information:
- tenant_id: the tenant identifier
- content_id: the learning content identifier
- location: a string representation of the search-term match location within the course. The format of this value depends on the type or author of the learning content. For example, for an mp3 course, the ‘location’ value would reflect a timestamp, whereas for packaged content, the ‘location’ value could reflect the name of, or navigation path to, a specific chapter or slide.
- match_strength: a float value between 0 and 1 that represents the match strength, where a value of 1 represents a perfect match. This property is only included when performing a semantic search.
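As a minimal sketch of how a client might model one entry of the response list, the Python below maps the four documented properties onto a small data class. The names SearchResult and parse_result are hypothetical and not part of Generator; the only behavior taken from the description above is that match_strength is absent for lexical searches.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class SearchResult:
    """One entry of the search response list (hypothetical client-side type)."""
    tenant_id: str
    content_id: str
    location: str
    # Only returned for semantic searches, so it defaults to None.
    match_strength: Optional[float] = None


def parse_result(entry: dict[str, Any]) -> SearchResult:
    """Convert one decoded JSON entry into a SearchResult."""
    strength = entry.get("match_strength")
    return SearchResult(
        tenant_id=entry["tenant_id"],
        content_id=entry["content_id"],
        location=entry["location"],
        match_strength=None if strength is None else float(strength),
    )
```

Treating match_strength as optional lets the same type hold both lexical and semantic results.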
Search Scopes
When performing a search, it may be helpful to limit the focus of the search operation. Generator can limit the scope of a search in two different ways:
- within a specific course, or
- across a tenant
Both scopes are supported through the same search endpoint. The scope of the search is determined by the presence of the content_id query parameter. If the request includes the content_id parameter, the search is performed against that specific course. If no content_id parameter is included, the search is performed across all courses imported into the specified tenant.
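The scope rule can be sketched as a small helper that builds the query string and adds content_id only for a course-scoped search. The function name search_query is hypothetical; the parameter names search_type, term, and content_id are the ones the search endpoint documents.

```python
from typing import Optional
from urllib.parse import urlencode


def search_query(term: str, search_type: str,
                 content_id: Optional[str] = None) -> str:
    """Build the query string for a tenant-wide or course-scoped search."""
    if search_type not in ("SEMANTIC", "LEXICAL"):
        raise ValueError("search_type must be 'SEMANTIC' or 'LEXICAL'")
    params = {"search_type": search_type, "term": term}
    # Including content_id narrows the search to one course;
    # leaving it out searches every course in the tenant.
    if content_id:
        params["content_id"] = content_id
    return urlencode(params)
```

Calling search_query("mode", "SEMANTIC") yields a tenant-wide query, while passing a content_id as the third argument scopes the same search to that course.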
The user must include the tenant_id when using the search endpoint. Currently, a single request cannot search across multiple tenants. Each tenant must be searched separately, though users can aggregate the results of the tenant-specific searches and order them by each result’s match_strength.
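The aggregation described above could look like the following sketch, which takes the decoded result lists from separate tenant-specific searches and orders the combined list by match_strength. The function name is hypothetical, and sorting entries without a match_strength (lexical results) last is an assumption made for this example.

```python
from typing import Any, Iterable


def merge_by_match_strength(
    per_tenant_results: Iterable[list[dict[str, Any]]],
) -> list[dict[str, Any]]:
    """Combine result lists from several tenants, strongest match first."""
    merged = [entry for results in per_tenant_results for entry in results]
    # Assumption: entries without match_strength are treated as 0.0,
    # which places them after every scored entry.
    return sorted(
        merged,
        key=lambda entry: entry.get("match_strength", 0.0),
        reverse=True,
    )
```

Because each entry carries its own tenant_id, the merged list still shows which tenant every result came from.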
The Search API Endpoint
Searches are performed through the GET /api/v1/content/search endpoint.
The following information is required with each search request:
- search_type: query parameter that specifies the search behavior. Value must be ‘SEMANTIC’ or ‘LEXICAL’
- term: query parameter that specifies the term to be searched
- tenant_id: header that specifies the identifier of the tenant to which the search is limited.
The following information is optional:
- content_id: query parameter that specifies the identifier of the content to which the search is limited. Default value is ‘’.
- skip: query parameter that specifies how many of the top search results to skip. Useful when paging search results. Default value is 0.
For example, consider the following API request performed against the tenant tenant1:
GET /api/v1/content/search?search_type=SEMANTIC&term=mode&skip=0
The application would perform a search against all courses under tenant tenant1. The search would be a semantic search on the term mode and would not skip any results.
An example response:
[
  {
    "location": "Mode/Determining the Mode",
    "content_id": "example_content_2",
    "tenant_id": "tenant1",
    "match_strength": 0.5288990669056727
  },
  {
    "location": "Mode/What is Mode?",
    "content_id": "example_content_2",
    "tenant_id": "tenant1",
    "match_strength": 0.4942204812129025
  },
  {
    "location": "Mode/No Mode/Two Modes",
    "content_id": "example_content_2",
    "tenant_id": "tenant1",
    "match_strength": 0.4798217093004602
  },
  {
    "location": "Introduction/Introduction",
    "content_id": "example_content_1",
    "tenant_id": "tenant1",
    "match_strength": 0.44236816525549805
  }
]
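The example request can be assembled in Python roughly as follows. This is a sketch using only the standard library: the base URL https://generator.example.com and the function name build_search_request are placeholders, and authentication is not covered here. The tenant identifier is sent as a tenant_id header and everything else as query parameters, as listed for the endpoint.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

SEARCH_PATH = "/api/v1/content/search"


def build_search_request(base_url: str, tenant_id: str, term: str,
                         search_type: str,
                         content_id: Optional[str] = None,
                         skip: int = 0) -> Request:
    """Build (but do not send) a GET request for the search endpoint."""
    params = {"search_type": search_type, "term": term}
    if content_id:
        # Only present for a course-scoped search.
        params["content_id"] = content_id
    params["skip"] = skip
    url = f"{base_url.rstrip('/')}{SEARCH_PATH}?{urlencode(params)}"
    # The tenant is identified by a header, not a query parameter.
    return Request(url, headers={"tenant_id": tenant_id}, method="GET")


# Placeholder base URL; substitute the address of a real deployment.
request = build_search_request(
    "https://generator.example.com", "tenant1", "mode", "SEMANTIC"
)
print(request.full_url)
```

Passing the resulting object to urllib.request.urlopen and decoding the body with json.load would yield a list shaped like the example response. To page through results, the same call can be repeated with a larger skip value.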