Custom LLM

Estimated reading: 2 minutes

The Custom LLM component in Robility Flow generates text responses from external LLMs using the user's input, system message, and configured parameters. It connects to Robility Manager to retrieve pre-stored configuration such as models, API keys, and endpoints.

This lets automated workflows perform intelligent text generation and response synthesis with providers such as Azure OpenAI and Google Vertex AI, without hardcoding provider configuration in individual workflows.
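To make the configuration-driven design concrete, here is a minimal sketch of how a request might be assembled from stored configuration rather than hardcoded values. The `LLMConfig` fields and `build_request` helper are illustrative assumptions, not the actual Robility schema or API:

```python
from dataclasses import dataclass

@dataclass
class LLMConfig:
    # Fields mirror the kind of details LLM Configuration Management
    # would store; names here are illustrative, not the real schema.
    provider: str   # e.g. "azure_openai" or "vertex_ai"
    model: str
    api_key: str
    endpoint: str

def build_request(config: LLMConfig, user_input: str, system_message: str,
                  max_output_tokens: int = 256) -> dict:
    """Assemble a provider-agnostic chat request from stored configuration."""
    return {
        "endpoint": config.endpoint,
        "model": config.model,
        "messages": [
            {"role": "system", "content": system_message},
            {"role": "user", "content": user_input},
        ],
        "max_output_tokens": max_output_tokens,
    }

# The workflow supplies only the prompt and system message;
# all provider details come from the retrieved configuration.
config = LLMConfig(provider="azure_openai", model="gpt-4o",
                   api_key="<from-manager>", endpoint="https://example.endpoint")
request = build_request(config, "Summarize this ticket.",
                        "You are a support assistant.")
```

Because the workflow never touches the provider details directly, swapping providers is a configuration change in Robility Manager, not a workflow edit.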

Note:
Ensure that the required LLM configurations are created in Robility Manager (LLM Configuration Management); the Custom LLM component retrieves all model and provider details from there.

Parameters

| Parameter | Description |
| --- | --- |
| Input* | Primary prompt or input text sent to the language model for processing. |
| System Message* | Instructions that guide the model's behavior, tone, and response style. |
| Stream | Enables or disables streaming responses, so output can be received incrementally as it is generated. Works only in chat mode. Enabled: the response arrives word by word as it is generated. Disabled: the component waits for the full response and delivers it all at once. |
| Execution Type* | Specifies how the request is executed. |
| Model Provider* | Selects the LLM provider (for example, Azure OpenAI or Google Vertex AI). |
| Azure API Version | Azure OpenAI API version to use for the request. |
| Max Tokens | Maximum total tokens (input + output) the model can process; helps the request fit within the model's context limit. |
| Vertex Location | Google Vertex AI region where the model is hosted. |
| Max Output Tokens | Maximum number of tokens the model may generate in its response; controls output length. |
| Max Retries | Number of retry attempts if the model request fails. Verbose must be enabled for retries to work. |
| Top K | Limits the number of candidate tokens considered at each generation step. Lower values yield more precise, stable output; higher values yield more diverse, creative output. |
| Top P | Selects from the most probable next tokens until their cumulative probability reaches the threshold. Lower values give more focused, predictable output; higher values give more diverse, flexible responses. |
| Verbose | Enables detailed logging. Must be enabled for Max Retries to take effect. |

*Required fields
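To clarify how Top K and Top P interact, here is a minimal sketch of the underlying sampling technique: keep the `top_k` most likely candidates, then trim to the smallest set whose cumulative probability reaches `top_p`, and sample from what remains. This illustrates the standard top-k/nucleus sampling algorithm, not Robility's internal implementation:

```python
import math
import random

def sample_next_token(logits: dict, top_k: int = 50, top_p: float = 0.9,
                      rng=random) -> str:
    """Sample the next token using top-k filtering followed by top-p (nucleus) filtering."""
    # Convert raw logits to probabilities (softmax).
    m = max(logits.values())
    probs = {tok: math.exp(v - m) for tok, v in logits.items()}
    z = sum(probs.values())
    probs = {tok: p / z for tok, p in probs.items()}
    # Top K: keep only the k most probable tokens.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    # Top P: keep the smallest prefix whose cumulative probability >= top_p.
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    tokens, weights = zip(*kept)
    return rng.choices(tokens, weights=weights)[0]

# Toy vocabulary with made-up logits for illustration.
logits = {"the": 3.0, "a": 2.0, "cat": 1.0, "zebra": -2.0}
# top_k=1 degenerates to greedy decoding: always the single most likely token.
assert sample_next_token(logits, top_k=1) == "the"
```

Lowering either knob shrinks the candidate pool (more predictable output); raising them widens it (more varied output), matching the parameter descriptions above.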
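The Max Retries / Verbose pairing can be pictured as a retry loop that logs each failed attempt. This is a hedged sketch of the general pattern, with hypothetical names; it is not the component's actual internals:

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("custom_llm")

def call_with_retries(request_fn, max_retries: int = 3, verbose: bool = True):
    """Attempt a model call up to max_retries times, logging each failure
    when verbose logging is enabled."""
    last_error = None
    for attempt in range(1, max_retries + 1):
        try:
            return request_fn()
        except Exception as exc:
            last_error = exc
            if verbose:
                log.info("Attempt %d/%d failed: %s", attempt, max_retries, exc)
    raise RuntimeError(f"Model request failed after {max_retries} attempts") from last_error

# Simulated flaky call: fails twice, then succeeds on the third attempt.
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("transient failure")
    return "Generated text"

result = call_with_retries(flaky, max_retries=3)
```

Transient failures are absorbed up to the configured limit, and the verbose log leaves a trace of each attempt for troubleshooting.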

Output

| Output | Description |
| --- | --- |
| Model Response | Generated text response produced by the configured model. |
| Language Model | The model that produced the response, returned in JSON format. |