Custom LLM

Estimated reading: 2 minutes

The Custom LLM component in Robility Flow generates text responses from external LLMs using the user's input, system message, and configured parameters. It connects to Robility Manager to retrieve pre-stored configuration such as models, API keys, and endpoints.

This lets automated workflows perform intelligent text generation and response synthesis with providers such as Azure OpenAI and Google Vertex AI, without hardcoding provider configuration in individual workflows.
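To make the configuration-driven design concrete, here is a minimal sketch of how a request might be assembled from stored configuration rather than hardcoded values. The `LLMConfig` fields and `build_request` helper are illustrative assumptions, not the actual Robility schema or API:

```python
from dataclasses import dataclass

@dataclass
class LLMConfig:
    # Fields mirror the kind of details LLM Configuration Management
    # would store; names here are illustrative, not the real schema.
    provider: str   # e.g. "azure_openai" or "vertex_ai"
    model: str
    api_key: str
    endpoint: str

def build_request(config: LLMConfig, user_input: str, system_message: str,
                  max_output_tokens: int = 256) -> dict:
    """Assemble a provider-agnostic chat request from stored configuration."""
    return {
        "endpoint": config.endpoint,
        "model": config.model,
        "messages": [
            {"role": "system", "content": system_message},
            {"role": "user", "content": user_input},
        ],
        "max_output_tokens": max_output_tokens,
    }

# The workflow supplies only the prompt and system message;
# all provider details come from the retrieved configuration.
config = LLMConfig(provider="azure_openai", model="gpt-4o",
                   api_key="<from-manager>", endpoint="https://example.endpoint")
request = build_request(config, "Summarize this ticket.",
                        "You are a support assistant.")
```

Because the workflow never touches the provider details directly, swapping providers is a configuration change in Robility Manager, not a workflow edit.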

Note:
Ensure that the required LLM configurations are created in Robility Manager (LLM Configuration Management); the Custom LLM component retrieves all model and provider details from there.

Parameters

| Parameter | Description |
| --- | --- |
| Input* | Primary prompt or input text sent to the language model for processing. |
| System Message* | Instructions that guide the model's behavior, tone, and response style. |
| Stream | Enables or disables streaming responses, so output can be received incrementally as it is generated. Works only in chat mode. Enabled: the response arrives word by word as it is generated. Disabled: the component waits for the full response and delivers it all at once. |
| Execution Type* | Specifies how the request is executed. |
| Model Provider* | Selects the LLM provider (for example, Azure OpenAI or Google Vertex AI). |
| Azure API Version | Azure OpenAI API version to use for the request. |
| Max Tokens | Maximum total tokens (input + output) the model can process; helps the request fit within the model's context limit. |
| Vertex Location | Google Vertex AI region where the model is hosted. |
| Max Output Tokens | Maximum number of tokens the model may generate in its response; controls output length. |
| Max Retries | Number of retry attempts if the model request fails. Verbose must be enabled for retries to work. |
| Top K | Limits the number of candidate tokens considered at each generation step. Lower values yield more precise, stable output; higher values yield more diverse, creative output. |
| Top P | Selects from the most probable next tokens until their cumulative probability reaches the threshold. Lower values give more focused, predictable output; higher values give more diverse, flexible responses. |
| Verbose | Enables detailed logging. Must be enabled for Max Retries to take effect. |

*Required fields
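To clarify how Top K and Top P interact, here is a minimal sketch of the underlying sampling technique: keep the `top_k` most likely candidates, then trim to the smallest set whose cumulative probability reaches `top_p`, and sample from what remains. This illustrates the standard top-k/nucleus sampling algorithm, not Robility's internal implementation:

```python
import math
import random

def sample_next_token(logits: dict, top_k: int = 50, top_p: float = 0.9,
                      rng=random) -> str:
    """Sample the next token using top-k filtering followed by top-p (nucleus) filtering."""
    # Convert raw logits to probabilities (softmax).
    m = max(logits.values())
    probs = {tok: math.exp(v - m) for tok, v in logits.items()}
    z = sum(probs.values())
    probs = {tok: p / z for tok, p in probs.items()}
    # Top K: keep only the k most probable tokens.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    # Top P: keep the smallest prefix whose cumulative probability >= top_p.
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    tokens, weights = zip(*kept)
    return rng.choices(tokens, weights=weights)[0]

# Toy vocabulary with made-up logits for illustration.
logits = {"the": 3.0, "a": 2.0, "cat": 1.0, "zebra": -2.0}
# top_k=1 degenerates to greedy decoding: always the single most likely token.
assert sample_next_token(logits, top_k=1) == "the"
```

Lowering either knob shrinks the candidate pool (more predictable output); raising them widens it (more varied output), matching the parameter descriptions above.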
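The Max Retries / Verbose pairing can be pictured as a retry loop that logs each failed attempt. This is a hedged sketch of the general pattern, with hypothetical names; it is not the component's actual internals:

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("custom_llm")

def call_with_retries(request_fn, max_retries: int = 3, verbose: bool = True):
    """Attempt a model call up to max_retries times, logging each failure
    when verbose logging is enabled."""
    last_error = None
    for attempt in range(1, max_retries + 1):
        try:
            return request_fn()
        except Exception as exc:
            last_error = exc
            if verbose:
                log.info("Attempt %d/%d failed: %s", attempt, max_retries, exc)
    raise RuntimeError(f"Model request failed after {max_retries} attempts") from last_error

# Simulated flaky call: fails twice, then succeeds on the third attempt.
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("transient failure")
    return "Generated text"

result = call_with_retries(flaky, max_retries=3)
```

Transient failures are absorbed up to the configured limit, and the verbose log leaves a trace of each attempt for troubleshooting.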

Output

| Output | Description |
| --- | --- |
| Model Response | Generated text response produced by the configured model. |
| Language Model | The model that produced the response, returned in JSON format. |