Internal Chatbot for Companies – with Claude, AWS Bedrock and MCP
Most enterprise chatbots are glorified FAQ pages. Here is how we build internal AI assistants that actually know your business — using Claude, AWS Bedrock and the Model Context Protocol.
Your team asks the chatbot where to find the latest pricing sheet. It replies with a generic "I don't have access to that information." Sound familiar? Most enterprise chatbots are glorified search boxes with a personality. They sound smart, but the moment they need to touch your actual business data, they hit a wall.
This article walks through how we build internal AI assistants that genuinely know your business — connected to your documents, your CRM, your internal tools — using Claude on AWS Bedrock and the Model Context Protocol (MCP).
Why generic chatbots fail in enterprise environments
The problem isn't the AI model itself. GPT-4, Claude, Gemini — they're all remarkably capable. The problem is context. A model trained on public internet data knows nothing about your internal processes, your product catalogue, your customer history, or your team's way of working.
Companies typically respond to this gap in one of two ways. Either they spend months fine-tuning a model on internal data (expensive, slow, and immediately stale), or they bolt on a basic RAG pipeline that retrieves document chunks and hopes the model figures out the rest. Neither approach scales well and neither gives employees the experience they actually want.
The real need is not a smarter model — it's a model that can take action in your systems. Read a ticket. Look up a customer. Trigger a workflow. That requires a different architecture entirely.
The stack: Claude on AWS Bedrock
We deploy Claude (Anthropic's model family) via AWS Bedrock — Amazon's managed service for foundation models. This combination gives you several enterprise-critical properties out of the box:
- Data privacy: your conversations never leave your AWS account and are never used to train Anthropic's models
- SOC 2 and ISO 27001 compliance, plus HIPAA eligibility, through AWS infrastructure
- Fine-grained IAM access control — exactly who can use the chatbot and what it can touch
- Pay-per-token pricing with no seat licences, so cost scales linearly with actual usage
- Claude's 200K token context window, meaning it can reason across entire documents in a single turn
Claude is particularly well-suited for enterprise work because of its instruction-following precision and its ability to say "I don't know" rather than hallucinating a confident-sounding answer. In an internal assistant, hallucination is not just annoying — it's a liability.
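Calling Claude through Bedrock is a single SDK call. Here is a minimal sketch using boto3's Converse API — the model ID and region are illustrative, so substitute the ones enabled in your account:

```python
MODEL_ID = "anthropic.claude-3-5-sonnet-20240620-v1:0"  # example model ID

def build_request(user_message, history=None):
    """Assemble the Converse API payload: prior turns plus the new user message."""
    messages = list(history or [])
    messages.append({"role": "user", "content": [{"text": user_message}]})
    return {
        "modelId": MODEL_ID,
        "messages": messages,
        "inferenceConfig": {"maxTokens": 1024, "temperature": 0.2},
    }

def ask_claude(user_message, history=None):
    """Send one turn to Claude on Bedrock and return the text of its reply."""
    import boto3  # requires AWS credentials with bedrock-runtime access
    client = boto3.client("bedrock-runtime", region_name="eu-central-1")
    response = client.converse(**build_request(user_message, history))
    return response["output"]["message"]["content"][0]["text"]
```

Because Bedrock bills per token, a call like this costs fractions of a cent — there is no per-seat licence anywhere in the path.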
What is MCP — the Model Context Protocol?
The Model Context Protocol is an open standard introduced by Anthropic in late 2024 that defines how AI models connect to external data sources and tools. Think of it as a USB-C standard for AI integrations: one protocol, any tool.
Before MCP, every AI integration was a custom engineering project. You wrote bespoke code to connect your chatbot to your CRM, another piece of code for your file system, another for your database. Each connector had to be rebuilt when models changed and maintained as APIs evolved.
With MCP, you build an MCP server for each system once — it exposes a standardised set of tools and resources — and any MCP-compatible client (Claude, and now a growing ecosystem of other models and tools) can use them immediately.
MCP dramatically cuts the integration surface. Instead of N × M model-to-tool integrations, you have N + M. One server for your CRM, one for your document store, one for your ticketing system — and any AI assistant can use all of them.
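To make this concrete, here is a toy MCP server for a document store, sketched with the official Python MCP SDK's FastMCP helper. The in-memory `DOCS` index is a stand-in for a real Confluence or SharePoint backend:

```python
# Stand-in for a real document index (Confluence, SharePoint, Google Drive).
DOCS = {
    "expense-policy": "Expenses under 50 EUR need no pre-approval.",
    "onboarding-guide": "New hires complete IT setup on day one.",
}

def search_documents(query: str) -> list:
    """Return titles of documents whose text mentions the query (toy full-text search)."""
    q = query.lower()
    return [title for title, text in DOCS.items() if q in text.lower()]

def serve():
    """Expose search_documents as an MCP tool. Requires `pip install mcp`."""
    from mcp.server.fastmcp import FastMCP
    server = FastMCP("document-store")
    server.tool()(search_documents)  # registers the function as an MCP tool
    server.run()  # serves over stdio by default
```

Once a server like this runs, any MCP-compatible client can discover and call `search_documents` without bespoke glue code — that is the N + M payoff.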
Architecture: how it all fits together
Here is the high-level architecture we use for enterprise internal chatbot deployments:
1. The employee sends a message via a web or Slack interface.
2. The request hits an API Gateway endpoint sitting in front of a Lambda function.
3. The Lambda function initialises an MCP client and calls Claude on AWS Bedrock, passing the user message, the conversation history, and the list of available MCP tools.
4. Claude reasons about the request. If it needs data, it calls one or more MCP tools — for example: fetch_customer_info, search_documents, or create_support_ticket.
5. The MCP servers execute the tool calls against the real systems (Salesforce, Confluence, Jira, your internal database) and return structured results.
6. Claude synthesises the tool results with its own reasoning and returns a final, grounded answer.
7. The answer — plus any actions taken — is streamed back to the employee.
The key insight is that Claude is not just generating text — it is orchestrating a workflow. It decides when to use tools, in what order, and how to combine the results. This is what makes the assistant genuinely useful rather than a dressed-up search box.
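The orchestration in steps 3–6 boils down to one loop: call the model, execute whatever tools it requests, feed the results back, repeat until it answers. A hedged sketch of that Lambda-side loop, written against the Bedrock Converse message shapes (the `invoke` callable abstracts the Bedrock client so the loop stays testable; tool names are illustrative):

```python
def run_turn(invoke, tool_registry, messages, tool_config):
    """Drive one user turn to completion, executing any tool calls Claude requests."""
    while True:
        response = invoke(messages=messages, toolConfig=tool_config)
        assistant_msg = response["output"]["message"]
        messages.append(assistant_msg)
        if response["stopReason"] != "tool_use":
            # Final grounded answer: concatenate the text content blocks.
            return "".join(b.get("text", "") for b in assistant_msg["content"])
        # Claude asked for tools: execute each one and return results
        # in the Converse toolResult shape.
        results = []
        for block in assistant_msg["content"]:
            if "toolUse" in block:
                use = block["toolUse"]
                output = tool_registry[use["name"]](**use["input"])
                results.append({"toolResult": {
                    "toolUseId": use["toolUseId"],
                    "content": [{"json": {"result": output}}],
                }})
        messages.append({"role": "user", "content": results})
```

Note that the model never touches your systems directly — every side effect goes through `tool_registry`, which is where the security controls described below attach.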
What MCP servers we typically build
Every company has different systems, but the following MCP servers appear in almost every deployment we do:
- Document store server — indexes Confluence, SharePoint or Google Drive and exposes search and retrieval tools. The assistant can fetch the exact policy document, product spec or onboarding guide the employee needs.
- CRM server — reads and writes customer records. The sales team can ask "what is the status of the Müller GmbH deal?" and get a real answer, not a guess.
- Ticketing server — integrates with Jira or Zendesk. The assistant can look up open tickets, create new ones, and even escalate issues.
- Calendar and availability server — allows the assistant to check schedules and propose meeting times.
- Custom internal API server — for any proprietary internal tooling or databases unique to your business.
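For Claude to use any of these servers, each tool must be described in the request's tool configuration. Here is what that looks like for the document-store search tool, following the Converse API's toolSpec shape — the tool name and schema fields are illustrative:

```python
# One toolSpec entry per tool; Claude reads the description and JSON schema
# to decide when and how to call it.
SEARCH_DOCUMENTS_TOOL = {
    "toolSpec": {
        "name": "search_documents",
        "description": "Full-text search over the company knowledge base "
                       "(Confluence/SharePoint index). Returns matching document titles.",
        "inputSchema": {"json": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search terms"},
                "limit": {"type": "integer", "description": "Maximum number of results"},
            },
            "required": ["query"],
        }},
    }
}

TOOL_CONFIG = {"tools": [SEARCH_DOCUMENTS_TOOL]}
```

A well-written `description` matters more than it looks: it is the only thing the model has to decide between five similar tools.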
Real-world results
For a mid-sized consulting firm, we deployed this stack in eight weeks. The assistant had access to four MCP servers: the company's Confluence knowledge base, their CRM (HubSpot), their project management tool (Asana), and their HR system for leave and policy queries.
- 14 hours per week saved per employee on average across the pilot group, primarily on information look-up and routine admin tasks
- 73% reduction in internal Slack messages of the type "does anyone know where X is?"
- Support ticket volume for internal IT dropped by 41% in the first month as employees got immediate, accurate answers from the assistant instead
- Onboarding time for new hires cut from three weeks to ten days because the assistant could answer procedural questions on demand
The biggest surprise for most clients is not the efficiency gain — it's the shift in how employees feel about information access. When the answer to any question is thirty seconds away, people stop hoarding knowledge and start building on each other's work.
Security considerations
AI assistants with real access to your systems require serious security design. The permissions the chatbot holds are the permissions it can exercise — and that Claude can exercise on behalf of anyone who talks to it. We apply the following principles in every deployment:
- Least-privilege by default: each MCP server exposes only the specific read/write operations the assistant actually needs. It can fetch a customer record, but it cannot delete one.
- User-scoped auth: the assistant authenticates on behalf of the signed-in user, so it can only access what that user can access. No elevation of privilege.
- Audit logging: every tool call is logged to CloudWatch with the user identity, timestamp, tool name, inputs and outputs. Full traceability.
- PII redaction: any personally identifiable information retrieved from tools is stripped from conversation logs in transit.
- Rate limiting and anomaly detection: unusual usage patterns trigger alerts before they become incidents.
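The audit-logging and PII-redaction principles can both hang off a single wrapper around every tool function. A minimal sketch — in a Lambda, JSON lines printed to stdout land in CloudWatch automatically; the redaction list here is a simplistic placeholder for a real PII filter:

```python
import functools
import json
import time

def audited(tool_fn, user_id, redact=("email", "phone")):
    """Wrap a tool function so each invocation emits one structured audit record."""
    @functools.wraps(tool_fn)
    def wrapper(**kwargs):
        # Redact sensitive inputs before they reach the log.
        safe_inputs = {k: ("[REDACTED]" if k in redact else v)
                       for k, v in kwargs.items()}
        record = {"ts": time.time(), "user": user_id,
                  "tool": tool_fn.__name__, "inputs": safe_inputs}
        try:
            result = tool_fn(**kwargs)
            record["status"] = "ok"
            return result
        except Exception as exc:
            record["status"] = f"error: {exc}"
            raise
        finally:
            print(json.dumps(record))  # one JSON line per tool call -> CloudWatch
    return wrapper
```

Because the wrapper sits between the MCP server and the real system, no tool call — successful or failed — can escape the audit trail.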
Is this right for your company?
This architecture delivers the most value when at least two of the following are true for your company:
- Your team spends significant time searching for internal information across multiple systems
- You have repetitive admin workflows that currently require a human to look something up and then act on it
- Knowledge is siloed across departments and the cost of those silos shows up in slow decisions or duplicated work
- You are on AWS or willing to use it (Bedrock is AWS-native)
If you're curious whether this fits your situation, we offer a free initial consultation where we map your tooling against a prototype architecture and give you an honest assessment of effort and expected return.
Want professionals to handle this for you?
Our AI Automation & Business Automation team handles everything — from strategy to execution.
Ready to take your business to the next level?
Let's get started