AI Landscape & Key Players
The Big Picture
The AI ecosystem has distinct layers. Understanding who does what helps you choose the right tools.
┌─────────────────────────────────────────────────────────────┐
│ APPLICATIONS │
│ Claude Code, Cursor, ChatGPT, Copilot, Perplexity │
├─────────────────────────────────────────────────────────────┤
│ FRAMEWORKS │
│ LangChain, LangGraph, LlamaIndex, CrewAI, Agent SDK │
├─────────────────────────────────────────────────────────────┤
│ INFRASTRUCTURE │
│ Hugging Face, Vector DBs, W&B, Replicate │
├─────────────────────────────────────────────────────────────┤
│ CLOUD PLATFORMS │
│ AWS Bedrock, Google Vertex AI, Azure OpenAI │
├─────────────────────────────────────────────────────────────┤
│ FOUNDATION MODELS │
│ Claude, GPT, Gemini, Llama, Mistral, DeepSeek, Grok │
└─────────────────────────────────────────────────────────────┘
Foundation Model Providers
These companies build the core LLMs that power everything else.
| Company | Model Family | Key Strengths | Open Source? |
|---|---|---|---|
| Anthropic | Claude (Opus 4, Sonnet 4, Haiku 3.5) | Safety, coding, long context (200K), agentic use | No |
| OpenAI | GPT (GPT-4o, o1, o3, GPT-4.1) | Ecosystem, multimodal, reasoning models | No |
| Google | Gemini (2.5 Pro, 2.5 Flash) | Massive context (1M+), multimodal, search integration | No |
| Meta | Llama (3.1, 4) | Leading open-weight models; can run locally | Yes (open weights under the Llama license) |
| Mistral | Mistral (Large, Medium, Small) | European, strong efficiency, open weights | Partially |
| xAI | Grok (3) | Real-time X/Twitter data access | No |
| DeepSeek | DeepSeek (V3, R1) | Open-source reasoning model, strong on math/code | Yes |
Reasoning Models (New Category)
A major trend: models that "think" before answering, spending extra inference-time compute on step-by-step reasoning before they produce a final response.
| Model | Provider | Key Trait |
|---|---|---|
| Claude Opus 4 | Anthropic | Extended thinking with visible reasoning traces |
| o1 / o3 | OpenAI | Chain-of-thought reasoning, excels at math and logic |
| DeepSeek R1 | DeepSeek | Open-source reasoning model |
| Gemini 2.5 Pro | Google | Built-in "thinking" mode |
Cloud AI Platforms
Run multiple models through a single cloud provider:
| Platform | Provider | What It Offers |
|---|---|---|
| AWS Bedrock | Amazon | Access Claude, Llama, Mistral, and more via AWS |
| Google Vertex AI | Google | Gemini models + third-party models + MLOps tools |
| Azure OpenAI | Microsoft | GPT models with enterprise Azure security/compliance |
When to use these: when an enterprise team needs compliance guarantees, VPC deployment, or access to multiple models through one billing relationship.
AI Coding Tools
| Tool | How It Works | Best For |
|---|---|---|
| Claude Code | CLI agent: reads files, writes code, runs commands, iterates | Agentic coding, complex refactors, full autonomy |
| GitHub Copilot | IDE autocomplete + chat, powered by GPT/Claude | Inline suggestions while typing |
| Cursor | VS Code fork with deep AI integration | AI-first IDE experience |
| Windsurf | IDE with "Cascade" multi-step AI flows | Agentic IDE with context awareness |
| Cline | Open-source VS Code extension for agentic coding | Customizable, open ecosystem |
How They Compare
| Feature | Claude Code | Copilot | Cursor | Windsurf |
|---|---|---|---|---|
| Interface | Terminal/CLI | IDE plugin | Full IDE | Full IDE |
| Autonomy | High (agent loop) | Low (suggestions) | Medium | Medium-High |
| Tool use | Full (file ops, git, shell) | Limited | Moderate | Moderate |
| Model | Claude | GPT/Claude (configurable) | Multiple | Multiple |
| Open source | SDK is open | No | No | No |
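The "agent loop" in the Autonomy row can be illustrated with a toy example. This is only a sketch of the general read, edit, run, iterate pattern, not how Claude Code or any other tool listed here is implemented. The `scripted_model` function is a hypothetical stand-in for an LLM call, and the in-memory `FILES` dict and the three tools are assumptions made for illustration.

```python
# In-memory "repository" with a deliberate bug (subtraction instead of addition).
FILES = {"app.py": "def add(a, b):\n    return a - b\n"}

def read_file(path):
    return FILES[path]

def write_file(path, content):
    FILES[path] = content
    return "ok"

def run_tests():
    # Execute the current source and check a single expectation.
    scope = {}
    exec(FILES["app.py"], scope)
    return "pass" if scope["add"](2, 3) == 5 else "fail"

TOOLS = {"read_file": read_file, "write_file": write_file, "run_tests": run_tests}

def scripted_model(history):
    """Hypothetical stand-in for an LLM: picks the next action from the last observation."""
    if not history:
        return ("run_tests", ())
    last_tool, last_result = history[-1]
    if last_tool == "run_tests":
        return ("done", ()) if last_result == "pass" else ("read_file", ("app.py",))
    if last_tool == "read_file":
        fixed = last_result.replace("a - b", "a + b")
        return ("write_file", ("app.py", fixed))
    return ("run_tests", ())  # after a write, re-run the tests

def agent_loop(model, max_steps=10):
    """Ask the model for an action, run the tool, feed the result back, repeat."""
    history = []
    for _ in range(max_steps):
        tool, args = model(history)
        if tool == "done":
            break
        history.append((tool, TOOLS[tool](*args)))
    return history

history = agent_loop(scripted_model)
print([tool for tool, _ in history])
```

The `max_steps` cap is the important design choice: an autonomous loop needs a hard stop so a confused model cannot iterate forever.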
AI Infrastructure
| Tool | Category | What It Does |
|---|---|---|
| Hugging Face | Model hub | Host, share, and discover models and datasets |
| Weights & Biases | Experiment tracking | Log metrics, compare runs, track model performance |
| Replicate | Model hosting | Run open-source models via API without infra |
| Modal | Serverless GPU | Run GPU workloads on demand (fine-tuning, inference) |
| Together AI | Inference | Fast, cheap inference for open-source models |
| Groq | Hardware inference | Ultra-fast inference on custom LPU chips |
Vector Databases
Essential for RAG systems (see 17 - RAG (Retrieval-Augmented Generation)):
| Database | Type | Differentiator |
|---|---|---|
| Pinecone | Managed cloud | Simplest to start, scales to billions |
| Weaviate | Cloud / self-hosted | Built-in hybrid search |
| Qdrant | Cloud / self-hosted | Rich filtering, high performance |
| Chroma | Local / embedded | Great for prototyping, simple API |
| pgvector | PostgreSQL extension | Use your existing Postgres, no new infra |
| Milvus | Self-hosted | Enterprise scale, open source |
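To make the core operation concrete, here is a minimal sketch of what every database in this table does at its heart: store vectors and return the nearest ones to a query. The `TinyVectorStore` class and the hand-written 3-dimensional vectors are hypothetical; real systems use embeddings from a model and approximate indexes rather than a brute-force scan.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

class TinyVectorStore:
    """Brute-force nearest-neighbour search over (text, vector) pairs."""

    def __init__(self):
        self._items = []

    def add(self, text, vector):
        self._items.append((text, vector))

    def search(self, query_vector, k=2):
        # Score every stored vector, then keep the k most similar texts.
        scored = [(cosine_similarity(query_vector, v), t) for t, v in self._items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:k]]

store = TinyVectorStore()
# Toy vectors standing in for real embeddings (assumed values).
store.add("How to reset a password", [0.9, 0.1, 0.0])
store.add("Quarterly revenue report", [0.0, 0.2, 0.9])
store.add("Account login troubleshooting", [0.8, 0.3, 0.1])

results = store.search([1.0, 0.2, 0.0], k=2)
print(results)
```

A linear scan like this is fine for a few thousand vectors; the products above exist because it stops being practical well before "billions".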
Agent Frameworks
| Framework | Description | Backed By |
|---|---|---|
| Claude Agent SDK | Official SDK for building agents with Claude | Anthropic |
| LangChain | Most popular framework, huge integration ecosystem | LangChain Inc. |
| LangGraph | Graph-based agent workflows with state management | LangChain Inc. |
| CrewAI | Multi-agent role-based collaboration | CrewAI |
| AutoGen | Multi-agent conversation framework | Microsoft |
Model Comparison Table
As of early 2026 (pricing and capabilities change frequently):
| | Claude Opus 4 | Claude Sonnet 4 | GPT-4o | Gemini 2.5 Pro | Llama 4 |
|---|---|---|---|---|---|
| Context | 200K | 200K | 128K | 1M+ | Up to 10M (Scout; hosted endpoints often lower) |
| Input $/1M tokens | $15 | $3 | $2.50 | $1.25 | No API fee (self-hosted; you pay for hardware) |
| Output $/1M tokens | $75 | $15 | $10 | $10 | No API fee (self-hosted; you pay for hardware) |
| Coding | Excellent | Excellent | Very Good | Very Good | Good |
| Reasoning | Excellent (extended thinking) | Very Good | Good (use o3 for reasoning) | Very Good (thinking mode) | Good |
| Multimodal | Vision | Vision | Vision + Audio | Vision + Audio + Video | Vision |
| Open Source | No | No | No | No | Yes |
| Best For | Deep reasoning, complex tasks | Daily coding, agents | General purpose, ecosystem | Long context, multimodal | Local deployment, privacy |
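Per-million-token prices are easier to compare once they are applied to a workload. The sketch below copies the prices from the table and estimates cost for a hypothetical monthly volume of 2M input tokens and 500K output tokens; the workload numbers and the `estimate_cost` helper are assumptions for illustration, and the prices will drift from whatever the providers currently charge.

```python
# Prices in USD per 1M tokens, copied from the comparison table.
# Llama 4 is left out: self-hosting has no per-token price to plug in.
PRICES = {
    "Claude Opus 4": {"input": 15.00, "output": 75.00},
    "Claude Sonnet 4": {"input": 3.00, "output": 15.00},
    "GPT-4o": {"input": 2.50, "output": 10.00},
    "Gemini 2.5 Pro": {"input": 1.25, "output": 10.00},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for the given token counts."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000

# Hypothetical workload: 2M input tokens and 500K output tokens per month.
for name in PRICES:
    print(f"{name}: ${estimate_cost(name, 2_000_000, 500_000):.2f}")
# Claude Opus 4: $67.50
# Claude Sonnet 4: $13.50
# GPT-4o: $10.00
# Gemini 2.5 Pro: $7.50
```

Output tokens dominate the bill for the more expensive models, so the input/output ratio of a workload matters as much as the headline input price.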
How to Choose
Need best coding agent? → Claude Sonnet 4 or Claude Code
Need deep reasoning? → Claude Opus 4 or OpenAI o3
Need 1M+ context? → Gemini 2.5 Pro
Need to run locally / privacy? → Llama 4 or DeepSeek V3
Need cheapest API? → Claude Haiku 3.5 or GPT-4o-mini
Need real-time search? → Perplexity or Grok
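The same decision list can be kept as a small lookup table, which is handy if you want to encode team defaults in a script. This is just one possible encoding; the key names and the `recommend` helper are made up for illustration, and the values are copied from the list above.

```python
# Need -> suggested options, copied from the "How to Choose" list.
RECOMMENDATIONS = {
    "coding agent": ["Claude Sonnet 4", "Claude Code"],
    "deep reasoning": ["Claude Opus 4", "OpenAI o3"],
    "1m+ context": ["Gemini 2.5 Pro"],
    "local / privacy": ["Llama 4", "DeepSeek V3"],
    "cheapest api": ["Claude Haiku 3.5", "GPT-4o-mini"],
    "real-time search": ["Perplexity", "Grok"],
}

def recommend(need: str) -> list[str]:
    """Return the suggested options for a need, or an empty list if unknown."""
    return RECOMMENDATIONS.get(need.strip().lower(), [])

print(recommend("Deep reasoning"))
print(recommend("video editing"))
```

Returning an empty list for unknown needs keeps callers from crashing on a typo, at the cost of silently hiding it; raising `KeyError` would be the stricter alternative.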
Key Trends (2025-2026)
| Trend | What's Happening |
|---|---|
| Reasoning models | o1/o3, DeepSeek R1, Claude extended thinking — models that "think harder" |
| Agentic AI | Models that use tools, plan, and execute multi-step tasks autonomously |
| Longer contexts | 1M+ tokens (Gemini), making RAG less necessary for some use cases |
| Cheaper inference | Prices for a given capability level have been dropping by roughly 5-10x per year, opening up more use cases |
| Open source catching up | Llama 4, DeepSeek V3, Qwen 2.5 approaching proprietary model quality |
| Multimodal | Vision, audio, video as standard inputs, not special features |
| On-device AI | Apple Intelligence, Gemini Nano, Llama on phones and laptops |
| MCP adoption | Standardized tool interface spreading across the ecosystem |
| Coding agents | Claude Code, Copilot Workspace, Devin — agents that write and ship code |