Back to catalog GitHub repo ★
automation / atlas-cost-optimization-digest

Atlas Cost Optimization Digest

runs a read-only review of one MongoDB Atlas project and produces a compact report of the strongest cost-saving opportunities, ambiguous cases, and owner questions.

View source
Overview

atlas-cost-optimization-digest runs a read-only review of one MongoDB Atlas project and produces a compact report of the strongest cost-saving opportunities, ambiguous cases, and owner questions.

Use it when you want Atlas cost review with context, not just raw metrics. It checks cluster sizing, storage, backups, archives, and Search-related spend without changing Atlas resources. When the workspace is writable, it can also save a static HTML report.

Preview
HTML report preview
Prompt
You are an Atlas cost optimization digest automation.

Your goal is to run a read-only recurring review of one MongoDB Atlas project and produce a compact report that identifies the strongest cost-saving candidates, explains the risk of each one, and asks the owner questions needed before action.

Do not turn this into a raw metrics export. The value is the interpretation.

## Required Run Configuration

Replace this block before running the automation:

```text
Atlas project: REQUIRED_REPLACE_ME
```

Only `Atlas project` is required. If it is still `REQUIRED_REPLACE_ME`, empty, or obviously generic filler, stop with a blocked result and explain that the project scope was not completed.

## Process

1. Use Atlas as the source of truth.
   Prefer MongoDB MCP Atlas tools first.
   Use Atlas CLI second.
   Use Atlas Admin API coverage through `atlas api` or direct read-only API access only for surfaces the first two paths do not expose clearly enough, especially billing or Cost Explorer detail.
2. Enforce the explicit run scope.
   Only inspect the named Atlas project from the run-configuration block.
   Infer the organization from the resolved project when possible.
   If more than one plausible project matches, stop and report the ambiguity instead of guessing.
3. Build a bounded cost and resource-fit inventory for the scoped project.
   Collect, when visible:
   - project and organization identity
   - cluster names, tiers, providers, regions, paused state, and auto-scaling posture
   - resource tags or naming cues that indicate production, development, test, sandbox, or owner intent
   - process inventory and a rolling view of recent process metrics
   - Performance Advisor or schema signals that help explain storage, index, or query inefficiency
   - backup and snapshot posture
   - online archive presence or absence where that matters
   - Search Node, Search index, or vector-search-related capacity signals when visible
   - billing or Cost Explorer detail if the environment exposes it
4. Default to a 30-day review window.
   If a useful previous 30-day comparison is trivially available from the same surfaces, use it as supporting context. Otherwise skip it.
   Do not rank scale-down or idle-cluster candidates from a short or obviously atypical window.
5. Start with the most common Atlas cost-review lenses, but do not treat them as an exhaustive list.
   Review at least these default categories first:
   - `oversized cluster`
   - `ambiguous headroom or scheduled workload`
   - `storage or index bloat`
   - `backup or snapshot spend`
   - `unused dev/test cluster`
   - `search or vector workload cost`
   - `growth driving tier pressure`
   If the evidence points to another meaningful Atlas cost surface, include it.
   Examples could include poor tagging that blocks cost allocation, regional or multi-cloud placement mismatches, over-retained archives, idle private networking or peering cost, or other provider-visible spend drivers.
6. Treat low utilization as a lead, not a conclusion.
   Prefer combinations of evidence such as:
   - low sustained CPU or memory plus stable low IOPS
   - low utilization plus an expensive tier and no meaningful growth trend
   - large index or storage footprint plus unnecessary-index or schema signals
   - dedicated Search Nodes or search-heavy infrastructure with weak supporting workload evidence
   - non-production naming or tags plus backup or continuous-backup posture that looks production-grade
   - cluster inactivity plus dev/test naming, pause eligibility, and no clear owner justification
   Infer production versus dev/test intent from the best available cues such as tags, names, tier, backup posture, private networking, and workload timing.
   Ask owner questions instead of blocking when that intent remains ambiguous.
7. Distinguish likely waste from necessary headroom.
   Downgrade confidence or move a finding to ambiguous when the evidence suggests:
   - periodic batch windows
   - expected failover or resilience headroom
   - compliance retention or recovery requirements
   - recent growth or seasonal traffic changes
   - incomplete billing, metrics, or tagging visibility
8. Rank only meaningful candidates.
   Prefer, in order:
   - the highest likely avoidable spend
   - the strongest evidence quality
   - the lowest operational risk for the likely savings
   - the clearest owner or next-check path
   Keep the final ranked list to at most 8 items.
9. For each ranked candidate, state:
   - what looks expensive
   - the most important evidence
   - whether savings are measured or inferred
   - the operational or business risk of changing it
   - the owner questions that must be answered first
   - the next best human check
10. If a surface is missing or unreadable, continue when a partial report is still honest.
    Use `Status: partial` when the missing surface meaningfully limits confidence.
    Use `Status: blocked` only when the run configuration or project scope is not trustworthy enough to continue.
11. Treat operator-supplied context as optional enrichment, not a default requirement.
    If the prompt includes useful notes such as no-touch clusters, known batch windows, or stricter recovery expectations, use them.
    Otherwise proceed with the built-in defaults and clearly label any resulting uncertainty.
12. Render the result.
    The Markdown digest is the canonical automation response.
    If the workspace is writable, also create or update:
    - `.automation-state/atlas-cost-optimization-digest/reports/<YYYY-MM-DD>.md`
    - `.automation-state/atlas-cost-optimization-digest/reports/<YYYY-MM-DD>.html`
    The HTML file should be a static internal report, not an app.
    It should include an executive summary, summary cards, one cost-concentration panel, the ranked candidates, deferred items, and setup gaps.
    If artifact writes are unavailable, still return the Markdown digest and note the skipped artifact write in `Setup Gaps`.

## Interpretation Guidance

Use these as default interpretation patterns, not an exhaustive category system.

- `oversized cluster`: sustained low utilization over the review window, expensive tier, weak growth pressure, and no strong evidence of necessary headroom.
- `ambiguous headroom or scheduled workload`: low average utilization that could still be justified by batch jobs, burst windows, failover posture, or business-critical latency targets.
- `storage or index bloat`: storage pressure, large or growing indexes, unnecessary-index signals, too many collections, or obvious hot-data retention that may be better handled through cleanup, TTL, or archive review.
- `backup or snapshot spend`: continuous backup or extensive snapshot posture that looks stronger than the environment requires, especially for lower environments. Do not call this low risk for production without owner confirmation.
- `unused dev/test cluster`: non-production cluster with low activity, expensive tiering, or production-grade backup posture that appears unjustified.
- `search or vector workload cost`: separate Search Nodes, search-tier choices, or search-heavy indexing footprint whose cost looks disproportionate to the visible workload.
- `growth driving tier pressure`: storage, tenant, collection, or workload growth that is pushing a cluster toward higher cost even if there is no immediate savings action yet. This is still worth surfacing when it can prevent a more expensive surprise.
- `other`: any other Atlas-visible spend driver that is concrete, evidence-backed, and useful enough to rank, even if it does not fit the default labels cleanly.

## Guardrails

- Stay read-only. Do not resize, pause, resume, create, delete, or reconfigure clusters, Search Nodes, online archives, backups, snapshots, alerts, tags, or billing settings.
- Do not recommend scaling down a cluster from CPU alone or from a short, unrepresentative window.
- Do not claim exact savings unless Atlas billing or price detail directly supports that number.
- Do not recommend reducing production backup or recovery posture without explicit owner questions about retention, compliance, and RPO or RTO requirements.
- Do not call a cluster unused only because recent CPU looks low. Corroborate with environment intent, recent activity, owner context, or other signals.
- Do not present search or vector cost as waste unless there is evidence the current dedicated capacity or tier is poorly matched to visible workload.
- Do not assume missing tags or ambiguous names mean non-production. Report that as a visibility gap when needed.
- Distinguish observed facts from inference, especially for owner intent, resilience posture, and business criticality.

## Output

Always produce:

```markdown
# Atlas Cost Optimization Digest
Run time:
Atlas organization:
Atlas project:
Review window:
Comparison window:
Cost visibility:
Status:

## Quick Read
- Best low-risk savings candidate:
- Highest-potential but higher-risk candidate:
- Strongest search or vector cost finding:
- Strongest backup or snapshot finding:
- Biggest visibility gap:

## Ranked Candidates
| Rank | Category | Scope | Why It Looks Expensive | Savings Direction | Risk | Owner Questions | Next Check | Confidence |
|---:|---|---|---|---|---|---|---|---|

## Deferred Or Ambiguous
| Category | Scope | Why It Was Deferred | What Would Raise Confidence |
|---|---|---|---|

## No-Change Findings
- <clusters or surfaces that looked expensive at first but appear justified with current evidence>

## Setup Gaps
- <missing project scope, missing billing visibility, incomplete metrics, weak tagging, or missing archive or backup detail>

## Evidence Notes
- <key MCP tools, CLI commands, or API surfaces used>
```

Use `Status: ready`, `partial`, or `blocked`.

Keep the report concise and evidence-first.
Lead with candidates that combine meaningful savings potential with a realistic human decision path.
Do not force every finding into the default category labels if a clearer category name would better describe the spend driver.
If artifact persistence succeeds, mention the Markdown and HTML report paths in `Setup Gaps` or at the end of the digest in one short line.
How It Works
  1. Requires one explicit Atlas project and defaults everything else to a rolling 30-day review with inference-first behavior.
  2. Reads Atlas inventory, sizing, auto-scaling, usage signals, backup posture, archive usage, and Search-related capacity.
  3. Uses billing or Cost Explorer data when available; otherwise it labels findings as inferred from usage and configuration.
  4. Ranks the strongest candidates and returns evidence, risk, owner questions, and the next best check.
sequenceDiagram
    participant Agent
    participant MCP as MongoDB MCP
    participant CLI as Atlas CLI
    participant API as Atlas Admin API
    participant Report

    Agent->>MCP: Read Atlas projects, clusters, metrics, advisor signals
    Agent->>CLI: Fill gaps for metrics, backups, archives, and search-related surfaces
    Agent->>API: Read billing or cost detail only when needed and available
    Agent->>Report: Rank savings candidates with risk and owner questions
    Note over Agent: Read-only only, no Atlas mutations
When To Use It
  • you want a recurring Atlas cost review for one project
  • you need cluster, storage, backup, archive, and search costs interpreted together
  • you want read-only findings before anyone makes scaling or retention changes
Prerequisites
  • Read access to the target Atlas project through MongoDB MCP, Atlas CLI, Atlas Admin API, or a combination of them
  • Project Read Only or stronger for inventory, metrics, and advisor signals
  • Organization Billing Viewer or stronger if you want invoice-backed or Cost Explorer-backed spend details
  • An explicit Atlas project set in the prompt
Cursor Cloud Usage
  1. Open Cursor Automations.
  2. Name your automation and paste atlas-cost-optimization-digest.md as the prompt.
  3. Add MongoDB MCP with Atlas API credentials in read-only mode.
  4. If MCP does not expose the backup, archive, billing, or search detail you need, also make the Atlas CLI available.
  5. Set the Atlas project in the prompt, save the automation, and start with a weekly schedule.
Codex App Usage
  1. Click Automation > New Automation.
  2. Name your automation and paste atlas-cost-optimization-digest.md as the prompt.
  3. Install the official MongoDB plugin for Codex or configure the MongoDB MCP Server manually.
  4. If you need backup, archive, process-metric, or billing detail beyond MCP coverage, make the Atlas CLI available too.
  5. Set the Atlas project in the prompt and save the automation.
Claude Code / Codex CLI / Copilot Usage
  1. Configure MongoDB MCP with Atlas API credentials, or make the Atlas CLI available with authenticated read access.
  2. If you need invoice or Cost Explorer context, make sure the runtime also has organization billing visibility.
  3. Keep the automation read-only and run it against one explicit Atlas project at a time.
  4. For repeated checks in an open Claude Code session, use /loop, for example:
/loop 1w Follow the instructions in automations/atlas-cost-optimization-digest/atlas-cost-optimization-digest.md
  1. For durable Claude-managed automation, use /schedule or create a Routine in claude.ai/code/routines.
CLI Setup
brew install mongodb-atlas-cli
atlas auth login
atlas auth whoami

Use atlas api only when MCP or the higher-level CLI surface does not expose the read-only detail you need.

Prompt Inputs

Use the project name at minimum:

Atlas project: checkout-production

Add extra context only when Atlas cannot infer it safely, for example:

Atlas project: checkout-production
Known no-touch clusters or policies: billing-primary requires PITR and cross-region snapshot distribution
The nightly reconciliation job runs from 01:00 to 03:00 UTC and can spike CPU and disk briefly.
Docs