Dependency


Dependency Groups Overview

The following dependency groups will provide the technical foundation for building, orchestrating, and operating AI-driven flows and agents across the platform. Each group will address a specific layer of functionality, from core orchestration to observability and infrastructure integration.

Core Flow & Agent Frameworks

This group will form the backbone of the platform’s orchestration layer.

langflow-base and langchain will provide the visual and programmatic building blocks for creating flows, chains, and agents that combine LLMs, tools, and data sources.
dspy-ai, pydantic-ai, and smolagents will enable more structured, programmatic agent design, helping you define tool-using agents with typed inputs/outputs, deterministic behavior, and optimization capabilities.
litellm will act as a unified LLM interface and proxy, allowing the platform to route calls across multiple model providers while centralizing configuration, logging, and throttling.
mcp (Model Context Protocol) will standardize how agents talk to external tools and services, making the platform more extensible and interoperable.

Collectively, this group will govern how flows and agents execute end-to-end across the platform.
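
The unified-interface idea behind litellm can be illustrated with a minimal, standard-library sketch that routes a call by a "provider/model" string. It does not call litellm itself; the registry, the `register` and `complete` names, the model names, and the echo-style handlers are all hypothetical stand-ins for real provider clients.

```python
from typing import Callable, Dict

# Hypothetical handler type: takes a model name and a prompt, returns text.
Handler = Callable[[str, str], str]

_REGISTRY: Dict[str, Handler] = {}


def register(provider: str, handler: Handler) -> None:
    """Associate a provider prefix with the function that serves it."""
    _REGISTRY[provider] = handler


def complete(model: str, prompt: str) -> str:
    """Route a 'provider/model' string to the matching handler."""
    provider, sep, model_name = model.partition("/")
    if not sep or not model_name:
        raise ValueError(f"expected 'provider/model', got {model!r}")
    try:
        handler = _REGISTRY[provider]
    except KeyError:
        raise ValueError(f"no handler registered for provider {provider!r}") from None
    return handler(model_name, prompt)


# Stand-in handlers; a real deployment would call the provider's API here.
register("openai", lambda m, p: f"[openai:{m}] {p}")
register("anthropic", lambda m, p: f"[anthropic:{m}] {p}")
```

Calling `complete("openai/model-a", "hi")` and `complete("anthropic/model-b", "hi")` goes through the same entry point, which is where shared concerns such as logging or throttling could be attached.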

LangChain Integrations

This group will unlock the wider AI ecosystem through plug-and-play integrations.

Model provider integrations (langchain-google-genai, langchain-cohere, langchain-anthropic, langchain-openai, langchain-groq, langchain-mistralai, langchain-aws, langchain-nvidia-ai-endpoints, langchain-sambanova, etc.) will let flows switch between different LLMs and embedding models without changing the core logic.
Vector and storage integrations (langchain-pinecone, langchain-chroma, langchain-milvus, langchain-astradb, langchain-elasticsearch, langchain-mongodb, langchain-graph-retriever) will allow retrieval-augmented generation (RAG) patterns across multiple databases.
Tooling and service integrations (langchain-unstructured, langchain-google-vertexai, langchain-google-calendar-tools, langchain-google-community, langchain-ollama, langchain-community) will give the platform access to document loaders, calendar operations, local models, and community-maintained connectors.

This group will ensure that LangChain-based flows can connect to the models, storage systems, and services required by enterprise use cases.

AI / LLM Provider SDKs & Tools

These dependencies will provide direct access to commercial AI providers and LLM utilities outside LangChain, giving more flexibility in how the platform uses models.

Provider SDKs such as openai, huggingface-hub[inference], qianfan, ibm-watsonx-ai, langchain-ibm, assemblyai, twelvelabs, and jigsawstack will let the platform call text, vision, speech, and multimodal models directly.
mem0ai will add long-term memory capabilities for agents, improving continuity across sessions.
cleanlab-tlm will help evaluate output quality by scoring the trustworthiness of LLM responses, so that low-confidence answers can be flagged or reviewed.
gassist will introduce platform-specific helper capabilities on Windows where required.

This group will allow the platform to mix high-level orchestration (via LangChain) with low-level, provider-specific capabilities when needed.

Vector DB & Retrieval

This group will power semantic search, embeddings storage, and graph-based retrieval across structured and unstructured content.

Client libraries for hosted or managed vector and memory services such as qdrant-client, weaviate-client, zep-python, upstash-vector, astra-assistants[tools], and metal_sdk will let the platform scale RAG workloads with external services.
Local / embedded vector solutions like chromadb, faiss-cpu, and pgvector will support on-prem or lightweight deployments.
Search-layer integrations like opensearch-py and elasticsearch will combine full-text and vector search patterns.
graph-retriever will allow graph-based retrieval, enabling more advanced knowledge graph and relationship-based querying.

Together, this group will enable flexible retrieval strategies ranging from simple document search to complex knowledge graph–driven reasoning.
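
The simplest of these retrieval strategies, nearest-neighbour search over embeddings, can be sketched in plain Python. This is an illustrative brute-force version, not how any of the listed clients work internally; the function names and the toy two-dimensional vectors are assumptions made for the example.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention here: a zero vector matches nothing
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    documents: List[Tuple[str, Sequence[float]]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k (doc_id, score) pairs most similar to the query."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in documents]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy corpus with hand-written embeddings (real ones come from a model).
DOCS = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [1.0, 1.0])]
```

A dedicated vector store replaces the linear scan with an index, but the contract stays the same: a query vector goes in, ranked document identifiers come out.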

Databases, Storage & ORM

This group will handle application state, logs, configuration, and analytical data.

sqlalchemy (with aiosqlite and Postgres drivers) will provide the primary ORM and SQL abstraction for relational data.
aiosqlite, pymongo, supabase, oracledb, and pyodbc will allow the platform to connect to SQLite, MongoDB, Supabase (Postgres), Oracle, and ODBC-compatible databases.
redis will be used for caching, queues, and ephemeral state management.
azure-storage-blob will support blob storage for large artifacts such as files, logs, and attachments.
Serialization and analytics tools like fastavro, pyarrow, and fastparquet will handle efficient storage and processing of large datasets and event streams.

This group will ensure the platform can persist and manage data in a reliable, scalable, and provider-agnostic way.
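
As a rough illustration of persisting application state, the sketch below stores per-flow state as JSON in SQLite using only the standard library `sqlite3` module rather than SQLAlchemy. The `flow_state` table, its columns, and the helper names are hypothetical and chosen only for this example.

```python
import json
import sqlite3
from typing import Any, Dict, Optional


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open a SQLite database and make sure the state table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS flow_state ("
        "flow_id TEXT PRIMARY KEY, payload TEXT NOT NULL)"
    )
    return conn


def save_state(conn: sqlite3.Connection, flow_id: str, state: Dict[str, Any]) -> None:
    """Insert or overwrite the JSON-encoded state for one flow."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO flow_state (flow_id, payload) VALUES (?, ?)",
            (flow_id, json.dumps(state)),
        )


def load_state(conn: sqlite3.Connection, flow_id: str) -> Optional[Dict[str, Any]]:
    """Return the stored state, or None if the flow is unknown."""
    row = conn.execute(
        "SELECT payload FROM flow_state WHERE flow_id = ?", (flow_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None
```

With an ORM the SQL strings would give way to mapped classes, and pointing the same code at another database would mostly be a matter of changing the connection URL.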

Cloud & External Platform SDKs

These SDKs will connect flows and agents to cloud infrastructure and external SaaS platforms.

boto3 will enable interactions with AWS services (S3, Lambda, Bedrock, etc.).
kubernetes will allow the platform to query and manage K8s resources, for example to orchestrate workloads or monitor deployments.
google-api-python-client will open up access to Google services like Drive, Sheets, and Gmail.
msal will provide secure authentication against the Microsoft identity platform (Azure AD, now Microsoft Entra ID).
atlassian-python-api will connect to Jira, Confluence, and other Atlassian tools for ticketing and documentation workflows.

This group will let agents move beyond pure “chat” and interact with real operational environments.

Web Search, Scraping & Data Sources

This group will allow the platform to pull information from the public web and external data sources for RAG and enrichment.

Search and data wrappers such as google-search-results, duckduckgo_search, metaphor-python, yfinance, and wolframalpha will provide web search, semantic search, financial data, and computational knowledge.
Web automation and scraping tools like scrapegraph-py, apify-client, spider-client, beautifulsoup4, and fake-useragent will allow robust crawling and HTML extraction while mimicking realistic user agents.
Content APIs like wikipedia and datasets (Hugging Face Datasets) will provide ready-to-use corpora and reference data.

This group will ensure that agents can ground their reasoning in real-world, up-to-date information where appropriate.
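
The HTML extraction step can be sketched with the standard library `html.parser` module. This is a simplified stand-in for what beautifulsoup4 does far more robustly; the `TextExtractor` class and `html_to_text` helper are hypothetical names for this example.

```python
from html.parser import HTMLParser
from typing import List


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring the contents of script and style tags."""

    _SKIP = {"script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self._chunks: List[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self._SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self._SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self._chunks.append(text)

    def text(self) -> str:
        return " ".join(self._chunks)


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML document as one string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Text reduced this way is what would typically be chunked and embedded for RAG, which is why markup, scripts, and styles are dropped before indexing.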

NLP, Parsing & Data Processing

These libraries will handle text processing, parsing, structural transformations, and numeric computation.

nltk and lark will support classic NLP tasks and grammar-based parsing (e.g., for DSLs or configuration languages).
jq and json_repair will help query, transform, and repair JSON structures produced by LLMs or external APIs.
Markdown will convert Markdown documents into HTML, which the platform will then reduce to text as needed.
Numeric and scientific computation will be handled by numexpr and scipy.
networkx will model and analyze graph structures, useful for knowledge graphs and flow graphs.
docling_core and docling will manage document parsing for complex formats like PDFs and office documents.

This group will give the platform the tools required to normalize, clean, parse, and analyze data before it is passed to models or stored.
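
To show the kind of cleanup JSON repair involves, here is a minimal sketch that handles two common defects in LLM output: a surrounding code fence and trailing commas. It uses only the standard library and is much narrower than json_repair; the helper names are hypothetical.

```python
import json
import re
from typing import Any

FENCE = "`" * 3  # a Markdown code fence, built this way to keep the example readable

# Matches a comma directly before a closing brace or bracket.
# Caveat: this naive pattern would also touch such text inside string values.
_TRAILING_COMMA = re.compile(r",\s*([}\]])")


def strip_code_fence(text: str) -> str:
    """Remove a surrounding Markdown code fence and optional language tag."""
    stripped = text.strip()
    fenced = (
        len(stripped) >= 2 * len(FENCE)
        and stripped.startswith(FENCE)
        and stripped.endswith(FENCE)
    )
    if not fenced:
        return stripped
    inner = stripped[len(FENCE):-len(FENCE)]
    first_line, sep, rest = inner.partition("\n")
    tag = first_line.strip()
    if sep and (not tag or tag.isalnum()):
        inner = rest  # drop the opening fence line, e.g. a "json" tag
    return inner.strip()


def parse_llm_json(text: str) -> Any:
    """Parse JSON from model output, retrying once after a light repair."""
    candidate = strip_code_fence(text)
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        repaired = _TRAILING_COMMA.sub(r"\1", candidate)
        return json.loads(repaired)  # still raises if the text is beyond repair
```

Valid input is parsed untouched, and the repair only runs after a first failure, so well-formed payloads are never altered.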

Document, OCR & Vision

This group will focus on extracting text and information from documents and images.

easyocr and opencv-python will enable OCR and basic computer vision operations for scanned documents, screenshots, and image-based inputs.
pdfkit will convert HTML content into PDF (it wraps the wkhtmltopdf binary, which must be installed separately), allowing the platform to generate human-readable reports or exports.

These capabilities will allow the platform to handle both digital and scanned content end-to-end.

Agent Tooling & Automation Bridges

These dependencies will make it possible for agents to call rich sets of external tools without custom glue code for each integration.

composio and composio-langchain will provide a tool hub that exposes many SaaS and productivity tools (e.g., calendars, task systems, communication apps) as callable actions from within a LangChain agent.

This group will turn agents into real “doers” by enabling workflows that perform actions across the user’s SaaS landscape.

Observability, Tracing & Experimentation

This group will provide full visibility into LLM behavior, performance, and reliability.

langfuse, langwatch, langsmith, opik, and traceloop-sdk will capture traces, metrics, and evaluations of flows and agent runs.
arize-phoenix-otel and openinference-instrumentation-langchain will integrate with OpenTelemetry-style pipelines for standardized tracing.
newrelic will act as an APM layer to monitor application performance and error rates.
pytest-codspeed will support performance benchmarking at the test level. needle-python is the client for the Needle retrieval service and is used for document search rather than telemetry.

This group will ensure that the platform remains observable, debuggable, and continuously improvable as workloads grow.
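
What a trace captures can be sketched with a small decorator that records a span for each call. This is a conceptual, standard-library illustration and not the API of langfuse, OpenTelemetry, or any other tool listed above; `Span`, `SPANS`, `traced`, and `summarize` are hypothetical names.

```python
import functools
import time
from dataclasses import dataclass
from typing import Any, Callable, List, Optional


@dataclass
class Span:
    name: str
    duration_s: float
    error: Optional[str] = None


# In-memory sink; a real tracer would export spans to a backend instead.
SPANS: List[Span] = []


def traced(name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Record the duration and any exception of each call as a Span."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            start = time.perf_counter()
            error: Optional[str] = None
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                error = repr(exc)
                raise  # tracing must not swallow the failure
            finally:
                SPANS.append(Span(name, time.perf_counter() - start, error))

        return wrapper

    return decorator


@traced("summarize")
def summarize(text: str) -> str:
    """Placeholder for a model call; it just truncates the input."""
    return text[:20]
```

Because the span is appended in the `finally` clause, failed calls are recorded alongside successful ones, which makes error rates measurable.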

System, Security & Utility Libraries

These cross-cutting libraries will handle configuration, security, logging, and other foundational concerns.

pydantic-settings will standardize configuration via environment variables and typed settings objects.
filelock and aiofile will support safe, concurrent file access, including async workloads.
MarkupSafe and certifi will handle HTML escaping and TLS certificate validation.
cryptography will provide secure primitives for encryption, key handling, and secure communication.
uv will improve Python environment and dependency management, particularly in containerized deployments.
GitPython will allow programmatic interactions with Git repositories (e.g., pulling flows or scripts from version control).
sseclient-py will consume server-sent event streams, types-cachetools will supply type stubs for the cachetools caching library, and structlog will provide structured logging.

This group will underpin the non-functional aspects of the platform—security, reliability, and maintainability.
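
The idea of typed settings loaded from environment variables can be sketched with a frozen dataclass. This is a standard-library stand-in that mimics the pattern pydantic-settings automates; the field names, the `APP_` prefix, and the defaults are assumptions made for the example.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    database_url: str
    cache_ttl_seconds: int
    debug: bool


def _parse_bool(raw: str) -> bool:
    """Accept a small set of common truthy and falsy spellings."""
    value = raw.strip().lower()
    if value in {"1", "true", "yes", "on"}:
        return True
    if value in {"0", "false", "no", "off"}:
        return False
    raise ValueError(f"not a boolean: {raw!r}")


def load_settings(env: Mapping[str, str] = os.environ, prefix: str = "APP_") -> Settings:
    """Build a typed Settings object from prefixed environment variables."""
    return Settings(
        database_url=env.get(prefix + "DATABASE_URL", "sqlite:///local.db"),
        cache_ttl_seconds=int(env.get(prefix + "CACHE_TTL_SECONDS", "300")),
        debug=_parse_bool(env.get(prefix + "DEBUG", "false")),
    )
```

Passing a plain dictionary as `env` keeps the loader easy to test, and a malformed value fails at startup instead of deep inside a request.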

Feature Flags & Config

This group will control runtime behavior via feature flags.

unleashclient will let the platform expose and evaluate feature flags, enabling gradual rollouts, experiments, and tenant-specific configurations without redeploying core services.

Feature-flagging will be critical for safely evolving the platform and testing new capabilities.
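
A gradual rollout can be sketched as a stable hash of the flag name and user identifier, mapped to a bucket from 0 to 99. This is a conceptual illustration only: it is not the Unleash client API and does not reproduce its hashing scheme, and the function names are hypothetical.

```python
import hashlib


def bucket(flag_name: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def is_enabled(flag_name: str, user_id: str, rollout_percent: int) -> bool:
    """Enable the flag for roughly rollout_percent percent of users."""
    if not 0 <= rollout_percent <= 100:
        raise ValueError("rollout_percent must be between 0 and 100")
    return bucket(flag_name, user_id) < rollout_percent
```

Since the bucket depends only on the flag and the user, a user who sees the feature at 30 percent keeps seeing it when the rollout widens to 60 percent.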

Platform-Specific (Windows / Conditional)

This group will cover OS-specific needs.

pywin32 will expose Windows APIs (COM, registry, etc.), enabling automations or integrations that are only possible or necessary on Windows environments.

This group will be used selectively, so that the platform can support Windows-specific scenarios without affecting cross-platform deployments.
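
One way to keep such a dependency optional is to guard its import by platform. The following is a minimal sketch under that assumption; `load_optional` is a hypothetical helper, while `win32api` is one of the modules that pywin32 installs.

```python
import importlib
import sys
from types import ModuleType
from typing import Optional


def load_optional(module_name: str, required_platform: str) -> Optional[ModuleType]:
    """Import a module only on the given platform; return None otherwise."""
    if sys.platform != required_platform:
        return None
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None  # the package is not installed on this machine


# On Windows with pywin32 installed this is the module; elsewhere it is None.
win32api = load_optional("win32api", "win32")
```

Callers can then check `win32api is not None` before using any Windows-only feature, and the same code still imports cleanly on Linux and macOS.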
