AI Native with the Python Ecosystem: LangChain / LangGraph / LangSmith
- Anand Nerurkar
- Nov 26
- 4 min read
What this article covers:
What LangSmith actually does
How it integrates with LangChain & LangGraph
Exact configuration (env + code)
What gets traced automatically
How this works in AKS / production
How this maps to Spring / Java world
1️⃣ What LangSmith Actually Is (Architect View)
LangSmith is a managed telemetry + evaluation backend for:
Prompt execution
Tool calls
Agent steps
RAG retrieval
LLM latency, cost, errors
Automatic trace trees (like OpenTelemetry for LLMs)
Think of it as:
“Application Performance Monitoring (APM) for LLM workflows.”
It does NOT:
Host your models
Expose APIs
Execute agents
It only collects traces + logs + metrics.
2️⃣ How LangSmith Integrates with LangChain & LangGraph
Very important point:
✅ LangSmith does NOT require explicit logging calls in most cases.
LangChain & LangGraph already emit internal execution events. LangSmith simply subscribes to them automatically when tracing is enabled.
So the flow is:
LangChain / LangGraph
↓ (auto trace hooks)
LangSmith SDK
↓ (HTTPS export)
LangSmith Cloud
No manual logger.info() needed for LLM steps.
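LangSmith can also trace plain Python functions that sit outside LangChain's callback system (for example, custom pre- or post-processing). A minimal sketch using the langsmith SDK's traceable decorator, assuming the tracing environment variables from the next section are set; the function and its name are illustrative:

from langsmith import traceable

@traceable(run_type="chain", name="normalize_query")  # appears as its own run in the project
def normalize_query(raw: str) -> str:
    return raw.strip().lower()

normalize_query("  What is the lending policy?  ")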
3️⃣ Exact Configuration Needed (Minimal & Production)
✅ Step 1: Create LangSmith API Key
From LangSmith UI:
Create Workspace
Create API Key
✅ Step 2: Environment Variables (Mandatory)
These 3 are enough to turn on tracing:
export LANGCHAIN_TRACING_V2="true"
export LANGCHAIN_API_KEY="lsm_xxxxxxxxxxxxx"
export LANGCHAIN_PROJECT="rag-prod"
Optional but recommended:
export LANGCHAIN_ENDPOINT="https://api.smith.langchain.com"
export LANGCHAIN_SESSION="tenant-abc-session-1"
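For local development, the same variables can be set from Python before any chains are built; a small sketch (the key below is a placeholder, and in AKS these values come from the pod spec, as shown in section 6):

import os

# Local-dev convenience only; never hardcode a real key
os.environ.setdefault("LANGCHAIN_TRACING_V2", "true")
os.environ.setdefault("LANGCHAIN_API_KEY", "lsm_xxxxxxxxxxxxx")  # placeholder
os.environ.setdefault("LANGCHAIN_PROJECT", "rag-prod")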
✅ Step 3: Your Existing LangChain Code (No Change!)
Your normal LangChain code:
from langchain_openai import AzureChatOpenAI
from langchain.chains import RetrievalQA

llm = AzureChatOpenAI(deployment_name="gpt-4o")

# retriever is your existing vector store retriever, e.g. vector_store.as_retriever()
qa = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=retriever,
)

response = qa.invoke("What is the lending policy?")
👉 This will now automatically appear in LangSmith UI as a trace tree.
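Optionally, you can tag runs so traces can be filtered per tenant or use case in the LangSmith UI. This goes through the standard runnable config; the tag and metadata values below are illustrative:

response = qa.invoke(
    "What is the lending policy?",
    config={
        "run_name": "lending-policy-qa",        # display name of the trace
        "tags": ["rag", "tenant-abc"],          # filterable tags in LangSmith
        "metadata": {"channel": "mobile-app"},  # arbitrary key/value metadata
    },
)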
4️⃣ LangGraph + LangSmith (Production Agent Tracing)
LangGraph is even more powerful with LangSmith.
Example:
from langgraph.graph import StateGraph, START, END

# State is your state schema (e.g. a TypedDict); agent_node / tool_node are your node functions
graph = StateGraph(State)
graph.add_node("agent", agent_node)
graph.add_node("tool", tool_node)
graph.add_edge(START, "agent")
graph.add_edge("agent", "tool")
graph.add_edge("tool", END)

compiled = graph.compile()
compiled.invoke({"input": "Approve this loan"})
LangSmith will show:
Agent reasoning
Tool calls
State transitions
Failures & retries
Token usage per node
With zero extra code.
5️⃣ What Exactly Gets Traced Automatically
Once enabled, LangSmith captures:
✅ Prompt templates
✅ Final resolved prompt
✅ LLM request + response
✅ Token count + cost
✅ Tool inputs / outputs
✅ Vector search queries
✅ Retrieved chunks
✅ Latency per hop
✅ Agent step graph
✅ Errors + stack traces
This is LLM observability at enterprise depth.
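Since every run lands in a project, the same data can be pulled back programmatically, for example to feed offline evaluation or cost reports. A minimal sketch with the LangSmith Python client, using the project name from the configuration above (the printed fields assume token counts are recorded on LLM runs):

from langsmith import Client

client = Client()  # picks up the API key / endpoint from the environment
for run in client.list_runs(project_name="rag-prod", run_type="llm", limit=10):
    print(run.name, run.start_time, run.total_tokens)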
6️⃣ AKS / Kubernetes Production Setup
In AKS you never hardcode keys. You use:
✅ Kubernetes Secret
apiVersion: v1
kind: Secret
metadata:
  name: langsmith-secret
type: Opaque
data:
  LANGCHAIN_API_KEY: <base64-encoded-key>
✅ Deployment YAML
env:
  - name: LANGCHAIN_TRACING_V2
    value: "true"
  - name: LANGCHAIN_PROJECT
    value: "rag-prod"
  - name: LANGCHAIN_API_KEY
    valueFrom:
      secretKeyRef:
        name: langsmith-secret
        key: LANGCHAIN_API_KEY
Now every pod automatically streams traces to LangSmith.
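A small, optional safeguard at application startup is to check that the secret actually made it into the pod, so a mis-mounted secret is caught early instead of silently failing trace export. A sketch, assuming the same variable names as the Deployment YAML above:

import logging
import os

# Optional startup sanity check for the tracing configuration
if os.getenv("LANGCHAIN_TRACING_V2") == "true" and not os.getenv("LANGCHAIN_API_KEY"):
    logging.warning("LangSmith tracing is enabled but LANGCHAIN_API_KEY is not set; trace export will fail")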
7️⃣ How This Works with Azure APIM in Front
Your real prod path:
Client
↓
Azure APIM (JWT, Rate Limit, IP Filter)
↓
AKS Service (FastAPI)
↓
LangChain / LangGraph
↓
Azure OpenAI + Vector DB
↓
LangSmith (Out-of-band HTTPS telemetry)
Important:
LangSmith traffic does NOT go through APIM
It is outbound HTTPS telemetry
Same as:
AppInsights
Datadog
Prometheus Remote Write
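Since this is plain outbound HTTPS, the only network requirement is that pods are allowed egress to the LangSmith endpoint (DNS, TLS, and any firewall allow-list). One simple way to verify reachability from inside a pod, assuming the requests package is available:

import os
import requests

# Any HTTP response here confirms DNS resolution, TLS, and egress rules are in place
endpoint = os.getenv("LANGCHAIN_ENDPOINT", "https://api.smith.langchain.com")
print(requests.get(endpoint, timeout=5).status_code)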
8️⃣ How This Maps to Spring / Java World (Critical for You)
LangSmith today is native-first for Python.
In Spring AI / Java, we use:
| Python World | Java / Spring World |
| --- | --- |
| LangSmith | OpenTelemetry + AppInsights |
| Auto LLM tracing | Manual Span instrumentation |
| Agent step tree | Custom Span nesting |
| Prompt registry | Git + Config Server |
Equivalent Java setup:
management:
  tracing:
    enabled: true
azure:
  application-insights:
    enabled: true
And you manually wrap:
Prompt execution
Tool calls
RAG calls
With OpenTelemetry spans.
So:
✅ Python = LangSmith (native)
✅ Java = OpenTelemetry + AppInsights (enterprise)
“LangSmith is not required for RAG or agents to function. It is strictly an observability, debugging, evaluation, and governance layer for LLM workloads.”
“To use LangSmith with LangChain or LangGraph, we only need to enable tracing via environment variables like LANGCHAIN_TRACING_V2 and provide the API key. LangChain internally emits execution events, which LangSmith captures automatically without code changes. In AKS, the API key is managed through Kubernetes Secrets and injected into pods as environment variables. LangSmith receives telemetry through outbound HTTPS. This is equivalent to how OpenTelemetry and Application Insights work for Spring AI in Java.”
“If we do not use LangSmith, then all LLM observability must be implemented using OpenTelemetry. FastAPI is auto-instrumented for HTTP traces, but LangChain, RAG pipelines, and agent steps must be manually wrapped using custom spans for vector search, prompt construction, and LLM inference. These spans are exported using the OTLP exporter to Azure Monitor or Grafana via an OpenTelemetry Collector deployed on AKS. Unlike LangSmith, OpenTelemetry provides infrastructure-level observability but does not give native prompt, agent, token, or RAG explainability out of the box.”
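To make that concrete, below is a minimal sketch of such manual wrapping with the OpenTelemetry Python API. The span names, attributes, and the retriever / llm objects are illustrative, and the OTLP exporter / Collector configuration is assumed to be set up separately:

from opentelemetry import trace

tracer = trace.get_tracer("rag-service")

def answer(question: str) -> str:
    with tracer.start_as_current_span("rag.retrieve") as span:
        chunks = retriever.invoke(question)          # vector search
        span.set_attribute("rag.chunks_returned", len(chunks))
    with tracer.start_as_current_span("llm.generate") as span:
        span.set_attribute("llm.model", "gpt-4o")
        return llm.invoke(question).content          # LLM inference

Each with-block becomes a child span of the incoming FastAPI request span, which gives the per-hop latency breakdown that LangSmith would otherwise provide automatically.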
Java Ecosystem
====
✅ Observability (Correct for Spring AI)
“For observability we use OpenTelemetry with Micrometer in Spring Boot, and export traces and metrics to Azure Application Insights and Grafana. We instrument LLM calls, RAG retrieval, and tool executions using custom spans so we get full end-to-end traceability across microservices.”
✅ Evaluation (Correct for Spring AI)
Spring AI + Java does not have LangSmith-like native eval, so you should say:
“For evaluation we use a combination of offline golden datasets, JUnit-based prompt regression tests, and automated quality metrics like answer correctness, context recall, and hallucination rate. These are integrated into our CI/CD pipeline using Azure DevOps.”
Optionally add:
Custom Python eval jobs in the MLOps pipeline (if used; see the sketch after this list)
Human-in-the-loop review for BFSI
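Where a custom Python eval job is used, a minimal sketch could replay a curated golden dataset through the chain and compute a simple correctness rate. The file name, record fields, and scoring rule below are illustrative, and qa is the RetrievalQA chain from the earlier Python example:

import json

def is_correct(expected: str, actual: str) -> bool:
    # Naive check; real pipelines use semantic similarity, LLM-as-judge, or RAGAS-style metrics
    return expected.lower() in actual.lower()

with open("golden_dataset.jsonl") as f:  # one {"question": ..., "expected": ...} record per line
    golden = [json.loads(line) for line in f]

scores = [is_correct(row["expected"], qa.invoke(row["question"])["result"]) for row in golden]
print(f"Answer correctness: {sum(scores) / len(scores):.0%}")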
🎯 Using Spring AI (Java)
“We are using Spring AI on Spring Boot deployed on AKS. For observability we rely on Micrometer with OpenTelemetry and export telemetry to Azure Application Insights and Grafana. For evaluation, we use curated golden datasets, automated regression tests in CI/CD, and domain expert review to measure correctness, hallucination rate, and compliance.”
🎯 “What Tool Do You Use for LLM Observability?”
“It depends on the runtime. For our Java-based Spring AI production workloads, we use OpenTelemetry with Azure Monitor for observability and CI-based evaluation. For Python-based LangChain and LangGraph workloads used in experimentation, we use LangSmith for deep LLM tracing and evaluation.”