Text Extraction
Generator provides the ability to extract the text content from your learning materials. This feature allows you to retrieve the parsed text from any piece of content that has been imported into Generator, making it useful for analysis, processing, or integration with external systems.
The text extraction endpoint supports multiple output formats, each designed for different use cases:
- TEXT: Returns the entire course’s text as plain text
- JSON: Returns the text broken up by location, including estimated token sizes
- VTT: Returns the transcription in WebVTT format (only available for content that required transcription, such as MP3 and MP4 files)
Text Extraction via the API
Text extraction is performed through the GET /api/v1/content/{content_id}/versions/{version}/text endpoint.
The following information is required with each request:
- content_id: path parameter that specifies the content identifier
- version: path parameter that specifies the version of the content
- tenant_id: header that specifies the tenant identifier
The following information is optional:
- format: query parameter that specifies the output format. Accepts text, json, or vtt. Default value is text.
- include_interactions: query parameter that specifies whether to include interaction data in the response. Default value is false. See the Interactions section below for more information.
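The required and optional parameters above can be assembled into a request as in the following minimal sketch. The base URL and the helper name build_text_request are assumptions made for illustration, not part of Generator; the path, the query parameter names, and the tenant_id header come from the description above.

```python
from urllib.parse import quote, urlencode

VALID_FORMATS = {"text", "json", "vtt"}


def build_text_request(base_url, content_id, version, tenant_id,
                       fmt="text", include_interactions=False):
    """Return (url, headers) for a text extraction GET request.

    base_url is a placeholder for wherever your Generator instance is hosted.
    """
    if fmt not in VALID_FORMATS:
        raise ValueError(f"format must be one of {sorted(VALID_FORMATS)}")
    if fmt == "vtt" and include_interactions:
        # The endpoint is documented to reject this combination with a 400,
        # so fail early on the client side.
        raise ValueError("VTT format does not support interactions")

    # Path parameters: content_id and version.
    path = (f"/api/v1/content/{quote(str(content_id), safe='')}"
            f"/versions/{quote(str(version), safe='')}/text")
    # Optional query parameters: format and include_interactions.
    query = urlencode({
        "format": fmt,
        "include_interactions": "true" if include_interactions else "false",
    })
    url = f"{base_url.rstrip('/')}{path}?{query}"
    # Required header: tenant_id.
    headers = {"tenant_id": tenant_id}
    return url, headers


url, headers = build_text_request(
    "https://generator.example.com",  # hypothetical host
    content_id="abc123",
    version=2,
    tenant_id="tenant-1",
    fmt="json",
    include_interactions=True,
)
```

The returned URL and headers can be passed to any HTTP client. Authentication is not covered here and would need to be added according to your deployment.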
Response Formats
TEXT Format
When using the text format, Generator returns the entire course’s text as a single plain text string. The response has a Content-Type header of text/plain.
By default, interactions are excluded from the text. When include_interactions=true is specified, the interaction text (such as questions and answer choices) will be included in the course text at the location where the interaction appears.
This format is useful when you need the complete text content in a simple, readable format without any additional structure or metadata.
Example Response:
Welcome to this course on advanced topics...
In this chapter, we will explore...
Now that we understand the basics...
JSON Format
When using the json format, Generator returns the text broken up by location, with each location including its text content and an estimated token count. The response has a Content-Type header of application/json.
The JSON response follows this schema:
- locations: a dictionary where each key is a location string and each value is a TextLocationResponse object containing:
  - text: the text content for that location
  - token_estimation: an estimated token count for the text at that location
  - interactions: (optional) an array of interaction objects. This field is only present when include_interactions=true. See the Interactions section for details on the interaction structure.
- token_estimation: the total estimated token count for all locations
By default, interactions are excluded from the response. When include_interactions=true is specified, interactions are included in a separate interactions field for each location.
This format is useful when you need to understand the structure of the content, work with specific sections, need token count information for AI processing, or want to extract interaction data separately from the main text.
Example Response (without interactions):
{
  "locations": {
    "Introduction": {
      "text": "Welcome to this course on advanced topics...",
      "token_estimation": 45
    },
    "Chapter 1: Getting Started": {
      "text": "In this chapter, we will explore...",
      "token_estimation": 120
    },
    "Chapter 2: Advanced Concepts": {
      "text": "Now that we understand the basics...",
      "token_estimation": 200
    }
  },
  "token_estimation": 365
}
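As an illustration of using the per-location token estimates for AI processing, the sketch below groups locations into batches that stay under a token budget. The function name batch_locations and the budget value are hypothetical, and the sketch assumes the key order of locations reflects the order of the course, which the response format does not state explicitly.

```python
import json


def batch_locations(payload, max_tokens):
    """Group location names into batches whose estimated tokens fit max_tokens.

    A single location larger than max_tokens still gets its own batch.
    """
    batches, current, used = [], [], 0
    for name, location in payload["locations"].items():
        tokens = location["token_estimation"]
        # Start a new batch when adding this location would exceed the budget.
        if current and used + tokens > max_tokens:
            batches.append(current)
            current, used = [], 0
        current.append(name)
        used += tokens
    if current:
        batches.append(current)
    return batches


# The example response shown above, abbreviated to the fields used here.
sample = json.loads("""
{
  "locations": {
    "Introduction": {"text": "Welcome to this course on advanced topics...", "token_estimation": 45},
    "Chapter 1: Getting Started": {"text": "In this chapter, we will explore...", "token_estimation": 120},
    "Chapter 2: Advanced Concepts": {"text": "Now that we understand the basics...", "token_estimation": 200}
  },
  "token_estimation": 365
}
""")

batches = batch_locations(sample, max_tokens=250)
per_location_total = sum(loc["token_estimation"] for loc in sample["locations"].values())
```

In this sample the per-location estimates add up to the top-level token_estimation; the sketch does not rely on that holding for every response.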
Example Response (with interactions):
{
  "locations": {
    "Introduction": {
      "text": "Welcome to this course on advanced topics...",
      "token_estimation": 45,
      "interactions": []
    },
    "Chapter 1: Getting Started": {
      "text": "In this chapter, we will explore...",
      "token_estimation": 120,
      "interactions": [
        {
          "type": "multiple_choice",
          "question": "What is the capital of France?",
          "choices": [
            {
              "text": "London",
              "is_correct": false,
              "category": null
            },
            {
              "text": "Paris",
              "is_correct": true,
              "category": null
            },
            {
              "text": "Berlin",
              "is_correct": false,
              "category": null
            }
          ],
          "range": null,
          "feedback": []
        }
      ]
    },
    "Chapter 2: Advanced Concepts": {
      "text": "Now that we understand the basics...",
      "token_estimation": 200,
      "interactions": []
    }
  },
  "token_estimation": 365
}
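To show how interaction data might be pulled out separately from the main text, here is a small sketch that walks a parsed response and collects each question with its correct choices. The helper name correct_answers is hypothetical, and the field handling is based only on the multiple_choice example above; other interaction types may have a different shape.

```python
def correct_answers(payload):
    """Return (location, question, correct choice texts) for each interaction."""
    results = []
    for name, location in payload["locations"].items():
        # "interactions" is only present when include_interactions=true,
        # so fall back to an empty list.
        for interaction in location.get("interactions", []):
            choices = interaction.get("choices") or []
            correct = [c["text"] for c in choices if c.get("is_correct")]
            results.append((name, interaction.get("question"), correct))
    return results


# A parsed response shaped like the example above (abbreviated).
payload = {
    "locations": {
        "Introduction": {"text": "Welcome...", "token_estimation": 45, "interactions": []},
        "Chapter 1: Getting Started": {
            "text": "In this chapter, we will explore...",
            "token_estimation": 120,
            "interactions": [
                {
                    "type": "multiple_choice",
                    "question": "What is the capital of France?",
                    "choices": [
                        {"text": "London", "is_correct": False, "category": None},
                        {"text": "Paris", "is_correct": True, "category": None},
                        {"text": "Berlin", "is_correct": False, "category": None},
                    ],
                    "range": None,
                    "feedback": [],
                }
            ],
        },
    },
    "token_estimation": 165,
}

answers = correct_answers(payload)
```

Returning a list of tuples rather than a dictionary keyed by question avoids losing entries if the same question text appears in more than one location.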
VTT Format
When using the vtt format, Generator returns the transcription in WebVTT (Web Video Text Tracks) format. The response has a Content-Type header of text/vtt.
Important:
- The VTT format is only available for content that required transcription during the import process, such as direct MP3 audio files and MP4 video files. If you attempt to retrieve VTT format for content that does not have a transcription file (such as packaged content), the endpoint will return a 400 error with the message “Content does not have a VTT file”.
- The VTT format does not support interactions. If you specify include_interactions=true with the VTT format, the endpoint will return a 400 error with the message “VTT format does not support interactions”.
This format is useful when you need to work with time-stamped transcriptions, create captions, or integrate with video players that support WebVTT.
Example Response:
WEBVTT

00:00:00.000 --> 00:00:05.500
Welcome to this audio course on advanced topics.

00:00:05.500 --> 00:00:12.300
In this first section, we will explore the fundamentals.

00:00:12.300 --> 00:00:20.100
Let's begin by understanding the core concepts.
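For working with the time-stamped transcription, the sketch below turns a VTT response body into a list of cues with start and end times in seconds. It is a deliberately minimal reader for output shaped like the example above, not a full WebVTT parser: it ignores cue settings, styling blocks, and notes, and the names parse_cues and to_seconds are illustrative.

```python
import re

# Matches cue timing lines such as "00:00:05.500 --> 00:00:12.300".
# The hours component is optional, as WebVTT allows "mm:ss.ttt".
CUE_TIMING = re.compile(
    r"((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})\s+-->\s+((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})"
)


def to_seconds(stamp):
    """Convert "hh:mm:ss.ttt" or "mm:ss.ttt" to seconds as a float."""
    parts = stamp.split(":")
    hours = int(parts[-3]) if len(parts) == 3 else 0
    return hours * 3600 + int(parts[-2]) * 60 + float(parts[-1])


def parse_cues(vtt_text):
    """Return a list of {"start", "end", "text"} dicts, one per cue."""
    cues = []
    current = None
    for raw_line in vtt_text.splitlines():
        line = raw_line.strip()
        match = CUE_TIMING.match(line)
        if match:
            # A timing line starts a new cue.
            current = {
                "start": to_seconds(match.group(1)),
                "end": to_seconds(match.group(2)),
                "text": "",
            }
            cues.append(current)
        elif not line:
            # A blank line ends the current cue.
            current = None
        elif current is not None:
            # Any other line after a timing line is cue text.
            current["text"] = (current["text"] + " " + line).strip()
    return cues


sample_vtt = """WEBVTT

00:00:00.000 --> 00:00:05.500
Welcome to this audio course on advanced topics.

00:00:05.500 --> 00:00:12.300
In this first section, we will explore the fundamentals.

00:00:12.300 --> 00:00:20.100
Let's begin by understanding the core concepts.
"""

cues = parse_cues(sample_vtt)
```

Each cue can then be mapped to a caption entry or used to look up the text spoken at a given time.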
Interactions
Many e-learning authoring tools include interactive elements such as quizzes, questions, and assessments within their content. Generator can extract and include these interactions in the text extraction output when the include_interactions parameter is set to true. See here for more information about interactions.