Metadata Generation

One of Generator’s key features is the generation of metadata values for your learning content. Metadata refers to the properties and information about a given piece of learning content that can be used to help catalog content libraries.

The following metadata fields are supported:

  • PLUG: A short one-sentence summary
  • BLURB: A snappy, two or three sentence marketing summary
  • SUMMARY: A paragraph that summarizes the content of the course
  • TITLE: A short, appealing title
  • KEYWORDS: A selection of words or phrases that capture the themes covered within the course’s content
  • SKILLS: A selection of marketable skills that would be improved by completing the course
  • QUESTIONS: A collection of multiple-choice questions and answers that review the covered topics of the learning content

How to Generate Metadata

Generator supports the generation of metadata either as a step during the content import process or after the content’s import through a dedicated “field generation” job.

  • Content Import Job: An import job is started through any of the POST /api/v1/jobs/* endpoints (see here for more information). The request schema for content import jobs includes a field_generation property. Here, users can populate the fields_to_generate list, which defines the set of metadata fields that should be generated as part of the content import job. Any number of fields can be included in the fields_to_generate property, and the order in which they are listed does not matter. Generator will take steps to prevent redundant metadata generation.
  • Dedicated Field Generation Job: Use the POST /api/v1/jobs/field endpoint to generate metadata field values for content that has already been imported into Rustici Generator. Only one field can be requested at a time using this endpoint. The request schema for this endpoint includes a force_regenerate property, which specifies how the application should behave when the requested metadata property already exists for the specified course.
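Because the dedicated endpoint accepts only one field per request, a client that wants several fields for already-imported content needs one request per field. The following is a minimal sketch of that loop, not the official request schema: the content_id property name and the placement of force_regenerate at the top level of the body are assumptions made for illustration, and field_job_payloads is a hypothetical helper.

```python
import json

def field_job_payloads(content_id, fields, force_regenerate=False):
    """Build one request body per field, since the endpoint takes a single field."""
    payloads = []
    for field in fields:
        payloads.append({
            "content_id": content_id,              # assumed property name
            "force_regenerate": force_regenerate,  # assumed to sit at the top level
            "field_generation": {"fields_to_generate": [field]},
        })
    return payloads

# Print the bodies that would be sent, one POST per requested field.
for body in field_job_payloads("demo_course", ["SUMMARY", "KEYWORDS"]):
    print("POST /api/v1/jobs/field", json.dumps(body))
```

Check the API reference for the exact body shape before sending these requests.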

Field Generation Customization

Field generation schemas include a field_generation property, which can be used to customize certain behaviors of the metadata field generation job.

  • fields_to_generate: The list of metadata fields to be generated.
  • parameters:
    • count: The number of metadata fields to generate
    • model: The specific language model to use for generation. Check the settings page here for the list of SupportedGenerationModel options
    • output_language_code: Generator currently offers limited support for language configuration of the generated metadata fields. This property accepts a language code that specifies the output language of the generated metadata fields. The only supported language/dialect options at this time are en-US and en-GB, but please reach out to our support team if you need wider language support.

Here’s an example field_generation schema:

{
  "fields_to_generate": [
    "SUMMARY",
    "KEYWORDS"
  ],
  "parameters": {
    "count": 4,
    "model": "CLAUDE_3_HAIKU",
    "output_language_code": "en-GB"
  }
}
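A client may want to catch obvious mistakes in a field_generation object before submitting a job. The sketch below checks only what this page states: the seven supported field names and the two supported language codes. The rule that count must be a positive integer is an assumption, model names are not checked, and check_field_generation is a hypothetical helper rather than part of Generator.

```python
SUPPORTED_FIELDS = {"PLUG", "BLURB", "SUMMARY", "TITLE", "KEYWORDS", "SKILLS", "QUESTIONS"}
SUPPORTED_LANGUAGES = {"en-US", "en-GB"}

def check_field_generation(schema):
    """Return a list of problems found in a field_generation dict (empty if none)."""
    problems = []
    for field in schema.get("fields_to_generate", []):
        if field not in SUPPORTED_FIELDS:
            problems.append(f"unsupported field: {field}")
    params = schema.get("parameters", {})
    count = params.get("count")
    # Assumption: count, when present, should be a positive integer.
    if count is not None and (not isinstance(count, int) or count < 1):
        problems.append("count must be a positive integer")
    lang = params.get("output_language_code")
    if lang is not None and lang not in SUPPORTED_LANGUAGES:
        problems.append(f"unsupported output_language_code: {lang}")
    return problems

example = {
    "fields_to_generate": ["SUMMARY", "KEYWORDS"],
    "parameters": {"count": 4, "model": "CLAUDE_3_HAIKU", "output_language_code": "en-GB"},
}
print(check_field_generation(example))
```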

Metadata Dependencies

In some cases, the metadata generation process may require the presence of other metadata values as a prerequisite. Specifically, the presence of SUMMARY metadata is a prerequisite for the generation of TITLE, KEYWORDS, and SKILLS metadata.

Although the generation jobs enforce this dependency, users are not expected to generate metadata fields in a particular order. If any prerequisite metadata doesn’t yet exist, Generator will include the generation of the prerequisite metadata as part of the configuration of the requested generation job.
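The dependency rule can be modeled on the client side to predict which fields a job is likely to produce. This is an illustrative sketch of the behavior described above, not Generator's internal logic; plan_generation and the PREREQUISITES table are hypothetical names, and the sketch does not model the handling of requested fields that already exist.

```python
# SUMMARY is the documented prerequisite for TITLE, KEYWORDS, and SKILLS.
PREREQUISITES = {"TITLE": ["SUMMARY"], "KEYWORDS": ["SUMMARY"], "SKILLS": ["SUMMARY"]}

def plan_generation(requested, existing):
    """Return fields in generation order, adding prerequisites that do not exist yet."""
    plan = []
    for field in requested:
        for needed in PREREQUISITES.get(field, []):
            if needed not in existing and needed not in plan:
                plan.append(needed)
        if field not in plan:
            plan.append(field)
    return plan

print(plan_generation(["TITLE", "SKILLS"], existing=set()))
print(plan_generation(["TITLE"], existing={"SUMMARY"}))
```

In the first call SUMMARY is added once even though two requested fields depend on it; in the second call nothing is added because SUMMARY already exists.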

Viewing Generated Metadata

To view the generated metadata for a given course, users should use the GET /api/v1/content/{content_id}/versions/{version}/fields/{field} endpoint. This endpoint returns a list of values (and other information) for the requested metadata field.

For example, the request GET /api/v1/content/demo_course/field/TITLE can be used to request the generated TITLE metadata values for a course called demo_course. The response would look similar to this:

[
    {
        "content_id": 2,
        "field_type": "TITLE",
        "value": "Understanding the Deeper Bond with Your Cat",
        "create_dt": "2024-10-07T19:16:17",
        "update_dt": "2024-10-07T19:16:17"
    },
    {
        "content_id": 2,
        "field_type": "TITLE",
        "value": "Communicating with Your Feline Soulmate",
        "create_dt": "2024-10-07T19:16:17",
        "update_dt": "2024-10-07T19:16:17"
    }
]
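A response like the one above can be reduced to a plain list of values with a few lines of client code. The sketch below builds the path from the endpoint template and extracts the value entries from a response body. The version value 1 is a placeholder chosen for illustration, and field_url and extract_values are hypothetical helpers.

```python
import json

def field_url(content_id, version, field):
    """Fill in the documented path template for a metadata field."""
    return f"/api/v1/content/{content_id}/versions/{version}/fields/{field}"

def extract_values(response_text):
    """Return just the 'value' of each entry in a field response."""
    return [entry["value"] for entry in json.loads(response_text)]

# Trimmed copy of the sample response, used here instead of a live request.
sample = """[
    {"content_id": 2, "field_type": "TITLE",
     "value": "Understanding the Deeper Bond with Your Cat"},
    {"content_id": 2, "field_type": "TITLE",
     "value": "Communicating with Your Feline Soulmate"}
]"""

print(field_url("demo_course", 1, "TITLE"))  # version 1 is a placeholder
print(extract_values(sample))
```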

One potential exception to this response format involves the QUESTIONS metadata field. QUESTIONS metadata can be exported from the application in a variety of formats, and the requested format impacts the Content-Type header value of the API response. When requesting QUESTIONS metadata, the endpoint will accept a format query parameter that defines how the QUESTIONS should be formatted. Available format values:

  • JSON - default - the application will return an 'application/json' response of serialized Question schemas
  • CSV - the application will return an 'application/csv' response. The CSV headers will be formatted as follows: “question,A,B,[…],answer” (the number of choices can vary).
  • AIKEN - the application will return a 'text/plain' response. The text will be the questions and answers in Aiken format.
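For readers unfamiliar with Aiken, the sketch below renders a question in the general Aiken layout: the question on one line, lettered choices, then an ANSWER line. It illustrates the format itself and is not guaranteed to match Generator's output byte for byte. The question dictionary follows the question schema described under Setting Questions Metadata Values; that the JSON export uses the same shape is an assumption. EXPECTED_CONTENT_TYPE simply restates the list above, and to_aiken is a hypothetical helper.

```python
import string

# Content-Type expected for each value of the format query parameter.
EXPECTED_CONTENT_TYPE = {
    "JSON": "application/json",
    "CSV": "application/csv",
    "AIKEN": "text/plain",
}

def to_aiken(question):
    """Render one multiple-choice question as Aiken text."""
    lines = [question["question"]]
    answer = None
    for letter, choice in zip(string.ascii_uppercase, question["choices"]):
        lines.append(f"{letter}. {choice['text']}")
        if choice["correct"]:
            answer = letter
    lines.append(f"ANSWER: {answer}")
    return "\n".join(lines)

demo = {
    "question": "Which animal has the most teeth?",
    "choices": [
        {"text": "Humans", "correct": False},
        {"text": "Snails", "correct": True},
    ],
}
print(to_aiken(demo))
```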

Manually Setting Metadata Values

In addition to generating new metadata for your learning content, Generator supports manual setting of metadata values through the API. Users may already have metadata information, such as course summaries or titles, for their learning content and may wish to use those values as the prerequisite information for other generation processes. Additionally, users may want to adjust the metadata values generated by Rustici Generator before exporting them for further use.

To manually set metadata values for learning content, use the PUT /api/v1/content/{content_id}/versions/{version}/fields/{field} endpoint. The request schema for this endpoint includes a values property that accepts a list of string values. For metadata properties that can be configured with multiple values (specifically SKILLS and KEYWORDS), the values list can contain multiple entries. For all other metadata properties, the values list must contain exactly one entry.

For example, to set the TITLE for your course, use the PUT /api/v1/content/demo_course/field/TITLE endpoint with the following request body:

{
  "value": [
    "Ask Your Cat: Tips for Communicating with Your Feline Soulmate"
  ]
}
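The single-value and multi-value rule is easy to check before sending a PUT request. The sketch below applies it to string-valued fields only; QUESTIONS is excluded because it uses its own schema. check_values is a hypothetical helper, not part of the Generator API.

```python
# Fields documented as accepting more than one value.
MULTI_VALUE_FIELDS = {"SKILLS", "KEYWORDS"}

def check_values(field, values):
    """Raise ValueError if the values list breaks the documented rule for the field."""
    if field == "QUESTIONS":
        raise ValueError("QUESTIONS uses its own schema and is not covered by this sketch")
    if field not in MULTI_VALUE_FIELDS and len(values) != 1:
        raise ValueError(f"{field} accepts exactly one value, got {len(values)}")
    return values

print(check_values("KEYWORDS", ["cats", "communication"]))
print(check_values("TITLE", ["Ask Your Cat: Tips for Communicating with Your Feline Soulmate"]))
```

Calling check_values("TITLE", ["A", "B"]) raises ValueError, since TITLE takes a single entry.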

Setting Questions Metadata Values

When manually setting the QUESTIONS metadata, a specific schema is expected.

Questions contain the following properties:

  • question: required - This is the question that is asked
  • choices: required - A list of QuestionComponentChoice objects that define the possible answers to the asked question (see below)
  • location: optional - Defines the text location that corresponds to the course material covered by the question. If provided, this value must correspond to an existing location defined by the course text.

The QuestionComponentChoice object defines a question choice and contains the following properties:

  • text: required - The text for a possible answer to the question
  • correct: required - A boolean value that specifies whether this choice is the correct answer to the question. Exactly one choice per question must be marked as correct.

Here is an example request:

{
  "value": [
    {
      "question": "Which animal has the most teeth?",
      "choices": [
        {
          "text": "Humans",
          "correct": false
        },
        {
          "text": "Elephants",
          "correct": false
        },
        {
          "text": "Snails",
          "correct": true
        },
        {
          "text": "Honeybees",
          "correct": false
        }
      ],
      "location": "Unusual Animal Facts"
    }
  ]
}

Limitations: At this time, only multiple-choice questions are supported, and users must use JSON when setting a value for QUESTIONS metadata. Each question must contain at least one incorrect choice and exactly one correct choice.
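These constraints can be checked locally before a question is submitted. The sketch below validates a single question dictionary against the rules stated in this section: required question text, required choices with text and a boolean correct flag, exactly one correct choice, and at least one incorrect choice. It does not verify that location matches an existing location in the course text, and check_question is a hypothetical helper.

```python
def check_question(question):
    """Return a list of problems with one question dict (empty if it looks valid)."""
    problems = []
    if not question.get("question"):
        problems.append("question text is required")
    choices = question.get("choices") or []
    if not choices:
        problems.append("choices are required")
    for i, choice in enumerate(choices):
        if "text" not in choice or not isinstance(choice.get("correct"), bool):
            problems.append(f"choice {i} needs text and a boolean correct")
    correct = sum(1 for c in choices if c.get("correct") is True)
    if correct != 1:
        problems.append(f"expected exactly 1 correct choice, found {correct}")
    if len(choices) - correct < 1:
        problems.append("expected at least 1 incorrect choice")
    return problems

good = {
    "question": "Which animal has the most teeth?",
    "choices": [
        {"text": "Humans", "correct": False},
        {"text": "Snails", "correct": True},
    ],
}
print(check_question(good))
```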

Metadata Regeneration

To reduce costs and prevent unintentional overwriting of metadata values, Generator avoids making unnecessary generation requests to the AI model. In some circumstances, however, users may be unsatisfied with the results of a metadata generation job. In these situations, users must explicitly request the regeneration of a metadata field. To do this, use the POST /api/v1/jobs/field endpoint to start another field generation job and set the force_regenerate schema property to true. When the force_regenerate property is true, the application will discard the existing metadata values for the specified property and request new values from the AI model. Generator will take steps to make sure the metadata values that result from the regeneration process are different from the original values.
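The rule described above can be summarized as a small decision function. This is a simplified model of the documented behavior for a single field, not Generator's implementation; will_call_model is a hypothetical name.

```python
def will_call_model(existing_values, force_regenerate):
    """Model the documented rule: generate if nothing exists or regeneration is forced."""
    return bool(force_regenerate) or not existing_values

# No values yet: generation runs.
print(will_call_model([], force_regenerate=False))
# Values exist and regeneration is not forced: the model is not called.
print(will_call_model(["Communicating with Your Feline Soulmate"], force_regenerate=False))
# Values exist but regeneration is forced: existing values are discarded and regenerated.
print(will_call_model(["Communicating with Your Feline Soulmate"], force_regenerate=True))
```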