Deploy & Host AI Agents the Right Way

You built an agent that works on your laptop. Now it needs to answer a WhatsApp message at 2am while you're asleep, survive a Razorpay outage without double-charging, and never leak your OpenAI key. This guide covers the hosting decisions, webhook plumbing, and operational habits that separate a demo from a product real buyers will pay for.

Pick your hosting model

Agents are mostly idle, then suddenly busy. How you host them determines your monthly bill, your cold-start latency, and how much ops work you sign up for.

Model	Best for	Cold starts	Rough cost (small agent)	Watch out for
Serverless (Vercel, Cloudflare Workers, AWS Lambda)	Bursty webhook traffic, low/spiky volume	Yes (100ms–2s)	₹0–800/mo, often free tier	Long-running jobs time out; no in-memory state
Always-on container (Railway, Render, Fly.io, a ₹400 VPS)	Steady traffic, background workers, websockets	No	₹400–2,000/mo	You own restarts, scaling, patching
Managed / hosted-by-marketplace	You want to sell, not babysit infra	N/A	Revenue share	Less control over runtime

Practical default for an Indian solo builder: start on serverless for the webhook endpoint (free tier covers early traffic), and add one small always-on container only when you need a queue worker or persistent connection. Don't pay for a fleet before you have customers.

If hosting infra isn't the business you want to be in, listing on AgentDukaan lets AgentDukaan host the agent for subscribers while you keep building — buyers subscribe, you get paid, no PM2 at midnight.

Webhooks 101 (and why your agent can't live without them)

Your agent doesn't sit refreshing WhatsApp asking "any new messages?" That's polling — wasteful and slow. Instead, WhatsApp, Telegram, Razorpay, and in-app chat push an HTTP POST to a URL you register. That URL is your webhook.

The lifecycle:

You expose POST https://youragent.in/webhooks/whatsapp and register it with the provider.
An event happens (customer sends "where's my order?").
The provider POSTs a JSON payload to your URL.
You verify it's genuine, return 200 fast, then do the slow work.

Two rules that trip people up:

Verify the signature. Anyone who learns your URL can fake a payload. Every serious provider signs requests. Check the signature before trusting a single byte:

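// Razorpay webhook verification (Express)
import crypto from "crypto";
import express from "express";

// Razorpay signs the raw request bytes, so hash the raw body, not a
// re-serialised JSON.stringify(req.body), which can differ byte for byte.
// express.raw() must run before any global express.json() on this route.
app.post("/webhooks/razorpay", express.raw({ type: "application/json" }), (req, res) => {
  const expected = crypto
    .createHmac("sha256", process.env.RAZORPAY_WEBHOOK_SECRET)
    .update(req.body)                 // Buffer holding the raw body
    .digest("hex");
  const received = String(req.headers["x-razorpay-signature"] ?? "");

  // Constant-time compare; timingSafeEqual throws on unequal lengths, so check first.
  const a = Buffer.from(expected);
  const b = Buffer.from(received);
  if (a.length !== b.length || !crypto.timingSafeEqual(a, b)) {
    return res.status(400).send("bad signature");
  }
  const event = JSON.parse(req.body.toString("utf8"));
  res.sendStatus(200);                // ack first
  handleEventAsync(event)             // process after
    .catch((err) => console.error("webhook handler failed", err));
});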

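Reply in under ~5 seconds. Providers retry if you're slow, which creates duplicates. Acknowledge with 200 immediately, then hand the actual LLM call / order lookup to a background job or queue. Never run a 12-second GPT-4 call before returning the webhook response. On serverless, the platform may freeze or stop the function once the response is sent, so push the job onto a queue before you return instead of relying on in-process background work.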

For local testing, expose your machine with ngrok http 3000 or cloudflared tunnel and register the temporary URL — but never ship a tunnel to production.

Handle secrets like they're radioactive

A hardcoded API key in your repo is a key on Pastebin within hours of going public. The rules are simple and non-negotiable:

Never commit keys. Use environment variables; keep .env in .gitignore.
Store production secrets in your host's secret manager (Railway Variables, Vercel Environment Variables, AWS Secrets Manager) — not in a config file you scp around.
Use separate keys for dev and prod, so a leaked test key can't drain your live wallet.
Rotate when anyone leaves the project or a key might be exposed. Assume rotation will happen; build for it.
Scope keys to the minimum permission needed.
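
One way to enforce the rules above is to read every secret from the environment once at startup and fail fast when one is missing. This is a minimal sketch: the `loadSecrets` helper and the `OPENAI_API_KEY` name are illustrative assumptions, not a required convention.

```javascript
// Hypothetical helper: read required secrets from an env-like object and
// fail fast at startup if any are missing.
const REQUIRED = ["OPENAI_API_KEY", "RAZORPAY_WEBHOOK_SECRET"]; // assumed names

function loadSecrets(env = process.env, required = REQUIRED) {
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    // Report the missing names only, never any values.
    throw new Error(`Missing required secrets: ${missing.join(", ")}`);
  }
  return Object.fromEntries(required.map((name) => [name, env[name]]));
}
```

Calling `loadSecrets()` as the first line of your entry point means a misconfigured deploy crashes immediately instead of failing on the first customer message.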

For agents you sell as source code, never bake your keys into the bundle. The buyer supplies their own credentials at setup. AgentDukaan's hosted agents use a pull-token model — secrets are fetched per-run and never shipped to sellers — which is the pattern to copy: the runtime requests the secret when it needs it rather than carrying it around.

Under India's Digital Personal Data Protection (DPDP) Act, 2023, customer phone numbers and chat content are personal data. Don't log full message bodies in plaintext, don't store more than you need, and have a deletion path. A WhatsApp support agent that dumps every customer's number into a public log is a liability, not a feature.
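
To keep phone numbers and message text out of logs, you can sanitise each message before it reaches the logger. The sketch below assumes an inbound message shaped like `{ from, body, receivedAt }`; that shape and both helper names are hypothetical, so adapt them to your provider's payload.

```javascript
// Hypothetical log-sanitising helpers; adjust to your own payload shape.

// Keep only the last 4 digits of a phone number, enough to trace a ticket.
function maskPhone(phone) {
  const digits = String(phone).replace(/\D/g, "");
  if (digits.length <= 4) return "*".repeat(digits.length);
  return "*".repeat(digits.length - 4) + digits.slice(-4);
}

// Build a log-safe view of a message: masked sender, and the body's length
// instead of the body text.
function toLogSafe(message) {
  return {
    from: maskPhone(message.from),
    bodyLength: typeof message.body === "string" ? message.body.length : 0,
    receivedAt: message.receivedAt,
  };
}
```

Log `toLogSafe(message)` rather than `message`, and the raw text never lands in a log file you later have to scrub.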

Idempotency, retries, and rate limits

Because providers retry, your agent will receive the same event twice. If "payment captured" runs twice, you ship two orders or send two confirmations. The fix is idempotency: dedupe on the event ID.

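async function handleEventAsync(event) {
  try {
    // Atomic claim: needs a unique index on `id`, so a concurrent duplicate
    // fails here instead of slipping past a separate findOne check.
    await db.events.insertOne({ id: event.id, at: new Date() });
  } catch (err) {
    if (err.code === 11000) return;     // duplicate key (MongoDB): already handled, no-op
    throw err;
  }
  // If doTheWork throws, the event stays marked as seen; delete or flag the row to allow a retry.
  await doTheWork(event);
}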

When you call out to OpenAI, Razorpay, or WhatsApp:

Retry with exponential backoff on 429 and 5xx (e.g. wait 1s, 2s, 4s), with a cap and jitter. Never retry in a tight loop — you'll make a rate limit worse.
Respect rate limits. WhatsApp throttles message sends; LLM providers have tokens-per-minute caps. Queue outbound work and drain it at a safe rate instead of firing 500 messages at once.
Set timeouts on every external call. A hung LLM request shouldn't freeze the whole agent.
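
The retry rule above can be sketched as a small wrapper. This version uses capped exponential backoff with full jitter (a random wait between 0 and the current ceiling); the helper names and default values are assumptions for illustration, not part of any provider's SDK.

```javascript
// Hypothetical helper: retry an async call on 429/5xx with capped
// exponential backoff and full jitter.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

function backoffDelay(attempt, baseMs = 1000, capMs = 30000, random = Math.random) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt); // 1s, 2s, 4s, ... capped
  return Math.floor(random() * ceiling);                   // full jitter: 0..ceiling
}

function isRetryable(status) {
  return status === 429 || (status >= 500 && status <= 599);
}

// `call` must return an object with a numeric `status`, e.g. a fetch Response.
async function withRetry(call, { retries = 3, baseMs = 1000, capMs = 30000 } = {}) {
  for (let attempt = 0; ; attempt++) {
    const response = await call();
    if (!isRetryable(response.status) || attempt >= retries) return response;
    await sleep(backoffDelay(attempt, baseMs, capMs));
  }
}

// Usage with a per-request timeout (`url` and `body` are placeholders;
// AbortSignal.timeout and global fetch need a recent Node version):
// const res = await withRetry(() =>
//   fetch(url, { method: "POST", body, signal: AbortSignal.timeout(10_000) }));
```

In this sketch a timeout rejects the promise and is not retried; decide per call whether a timed-out request is safe to repeat, since a payment call may have succeeded on the provider's side.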

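A small in-memory or Redis-backed queue between "webhook received" and "work done" gives you one place to apply deduplication, retries, and rate limiting. Use Redis or a hosted queue on serverless or multi-instance setups, where in-memory state doesn't survive between invocations or isn't shared across instances.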
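
As a rough illustration of draining work at a safe rate, here is a minimal in-memory queue that processes one job at a time with a fixed gap between jobs. `createQueue` and its `minGapMs` option are hypothetical names, and the queue loses its contents on restart, so treat it as a starting point for a single always-on container only.

```javascript
// Hypothetical in-memory queue: jobs run one at a time with a minimum gap
// between them. State is lost on restart.
function createQueue(handler, { minGapMs = 1000 } = {}) {
  const jobs = [];
  let draining = false;

  async function drain() {
    if (draining) return;            // only one drain loop at a time
    draining = true;
    try {
      while (jobs.length > 0) {
        const job = jobs.shift();
        try {
          await handler(job);
        } catch (err) {
          // A real worker would retry or move the job to a dead-letter list.
          console.error("job failed", err);
        }
        if (jobs.length > 0) {
          await new Promise((resolve) => setTimeout(resolve, minGapMs));
        }
      }
    } finally {
      draining = false;
    }
  }

  return {
    push(job) {
      jobs.push(job);
      void drain();                  // start draining if not already running
    },
    size: () => jobs.length,
  };
}
```

The webhook handler would call `queue.push(event)` and return 200; the handler passed to `createQueue` does the slow work at whatever rate the provider tolerates.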

Logging, health checks, and uptime

You can't fix what you can't see. Minimum viable observability:

Structured logs (JSON, not console.log("here")) with a request ID so you can trace one customer's journey across services. Redact secrets and PII.
A health endpoint — GET /healthz returning 200 plus a quick check that your DB and key dependencies are reachable. Hosts use it to restart dead instances.
Uptime monitoring — point a free pinger (UptimeRobot, Better Stack, or a cron job) at /healthz every minute and alert to Telegram or email when it fails. You want to know before your customer does.
Error alerts — send unhandled exceptions to a channel you actually watch.

GET /healthz  →  200 { "ok": true, "db": "up", "ts": "..." }
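
A health check along these lines could be built from a small function that runs each dependency probe with a short timeout. This is a sketch under assumptions: `checkHealth`, the probe map, and the `db.ping()` call in the wiring comment are hypothetical, and returning 503 on failure is one common choice rather than a requirement of any host.

```javascript
// Hypothetical health check: run each dependency probe with a short timeout
// and report 200 only if all of them pass.
async function checkHealth(probes, timeoutMs = 2000) {
  const body = { ok: true, ts: new Date().toISOString() };
  for (const [name, probe] of Object.entries(probes)) {
    let timer;
    const timeout = new Promise((_, reject) => {
      timer = setTimeout(() => reject(new Error("timeout")), timeoutMs);
    });
    try {
      await Promise.race([probe(), timeout]);
      body[name] = "up";
    } catch {
      body[name] = "down";
      body.ok = false;
    } finally {
      clearTimeout(timer);
    }
  }
  return { status: body.ok ? 200 : 503, body };
}

// Express wiring, assuming `app` exists and `db.ping()` checks the database:
// app.get("/healthz", async (req, res) => {
//   const { status, body } = await checkHealth({ db: () => db.ping() });
//   res.status(status).json(body);
// });
```

Keep the probes cheap: the uptime monitor hits this endpoint every minute, so a heavy query here becomes load of its own.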

Track three numbers from day one: success rate of runs, p95 response latency, and error count per hour. If success rate drops, or latency or error count spikes, you have a problem worth waking up for.
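
If you record one entry per run, those three numbers fall out of a few lines of arithmetic. The sketch below assumes run records shaped like `{ ok, latencyMs }` collected over one window (say, the last hour); the record shape and the `summarizeRuns` name are hypothetical, and p95 uses the nearest-rank method.

```javascript
// Hypothetical summary over a window of run records shaped like
// { ok: boolean, latencyMs: number }.
function summarizeRuns(runs) {
  if (runs.length === 0) {
    return { successRate: null, p95LatencyMs: null, errors: 0 };
  }
  const sorted = runs.map((r) => r.latencyMs).sort((a, b) => a - b);
  // Nearest-rank p95: the smallest value with at least 95% of samples at or below it.
  const rank = Math.ceil((95 * sorted.length) / 100);
  const errors = runs.filter((r) => !r.ok).length;
  return {
    successRate: (runs.length - errors) / runs.length,
    p95LatencyMs: sorted[rank - 1],
    errors,
  };
}
```

With fewer than about 20 runs in the window, the p95 is just the slowest run, so read it alongside the run count.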

Go-live checklist

Run through this before you flip the switch on a paying agent:

A prompt to pressure-test your own agent

Paste this into Claude or ChatGPT with your code or architecture notes:

You are a senior platform/SRE engineer reviewing an AI agent before it
goes live for paying customers in India. The agent handles [WhatsApp /
Telegram / email] traffic and calls [LLM + payment APIs].

Review the design/code below and list, in priority order:
1. Any place a secret could leak or is hardcoded.
2. Where a duplicate or retried webhook would cause double-action.
3. Missing signature verification, timeouts, or backoff.
4. What breaks under 10x traffic and how to fix it cheaply.
5. DPDP/PII risks in logging or storage.

For each, give the specific fix. Be blunt. Here is the code/architecture:
[paste here]

Next steps

Add a /healthz endpoint and a free uptime monitor to your agent today — it's 20 minutes of work and the highest-leverage thing on the list.
Move every key into your host's secret manager and rotate anything that's been in a repo.
Run the pressure-test prompt above against your code before launch.
When you're ready to put it in front of buyers without running infra yourself, see how hosted listings work on AgentDukaan, or read The Production-Ready Agent Checklist to round out the product side. Browse what's already live at AgentDukaan for patterns worth copying.

No rush — ship it when the checklist is green.
