Temporal Stripe Integration: Restricted API Keys, Spend Caps, and Agent Governance

Temporal's durable execution model is a strong foundation for AI agent workflows that touch money. But the same durability features (automatic activity retries, replay-based workflow recovery, and child workflow fan-out) are the ones that silently create duplicate Stripe charges when billing activities fail at the wrong moment.

Temporal is a durable workflow platform that makes long-running processes resilient by replaying workflow code from event history on worker failure, retrying failed activities according to a configurable RetryPolicy, and persisting workflow state across process restarts. For AI agents handling Stripe billing — subscription renewals, usage-based invoicing, metered payment processing — Temporal's durability is a natural fit. You don't want a billing workflow to simply disappear when a worker crashes mid-charge. The challenge is that Temporal's retry and replay semantics operate at the infrastructure layer and have no awareness of Stripe's charge semantics. A Stripe charge that completes successfully is not rolled back when Temporal retries the containing activity. Temporal's event-sourced replay faithfully re-executes any code placed in the wrong context. These two properties, combined, produce silent double billing in patterns that otherwise look correct.

This post covers three failure modes specific to Temporal's architecture — activity retry, workflow replay violation, and child workflow fan-out — with Python SDK code for each, and the two-layer governance pattern that closes all three.

Failure mode 1: Activity retry policy fires a duplicate charge

Temporal activities carry a RetryPolicy that is applied automatically when an activity raises an exception. The default policy retries indefinitely with exponential backoff starting at one second. This is the right default for most network-dependent operations: if a database write times out, you want the activity to retry until it succeeds. For Stripe billing, the default retry semantics are dangerous.
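
To make that schedule concrete, the sketch below computes the wait before each retry. It assumes Temporal's documented defaults (one-second initial interval, backoff coefficient 2.0, maximum interval of 100 times the initial interval); retry_delays is a hypothetical helper written for illustration, not part of the Temporal SDK.

```python
from datetime import timedelta
from typing import List, Optional


def retry_delays(
    attempts: int,
    initial: timedelta = timedelta(seconds=1),
    coefficient: float = 2.0,
    maximum: Optional[timedelta] = None,
) -> List[timedelta]:
    """Waits before each retry when an activity fails on every attempt."""
    cap = maximum if maximum is not None else initial * 100
    delays = []
    for retry in range(attempts - 1):
        # Exponential growth, clamped to the maximum interval
        delays.append(min(initial * (coefficient ** retry), cap))
    return delays


delays = retry_delays(attempts=5)
print([d.total_seconds() for d in delays])  # [1.0, 2.0, 4.0, 8.0]
```

Under these assumptions the first five attempts all start within 15 seconds of the original failure, so a billing activity that is unsafe to repeat can re-issue its charge several times before anyone sees an alert.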

The failure pattern: a charge_stripe activity creates a Stripe charge. The API call succeeds: the charge is created and a charge ID is returned. The activity then attempts to write the charge record to a database, and the write raises a connection error. Temporal receives the exception, marks the attempt as failed, waits one second, and re-runs the activity from the top. The activity sends the same charge request again. Because no idempotency key was provided in the original call, Stripe treats this as a new request and creates a second charge. The customer is billed twice.

# billing_activities.py — UNSAFE: no idempotency key, default retry policy
import stripe
from temporalio import activity

stripe.api_key = "sk_live_..."  # Bare production key — full scope

@activity.defn
async def charge_stripe(
    customer_id: str, amount_cents: int, billing_period: str
) -> dict:
    # Stripe charge succeeds: charge object created
    charge = stripe.Charge.create(
        amount=amount_cents,
        currency="usd",
        customer=customer_id,
        description=f"Subscription {billing_period}",
        # No idempotency_key: every retry = new Stripe charge
    )

    # If this raises, Temporal retries the entire activity from the top
    # stripe.Charge.create() runs again → second charge, no dedup
    write_charge_to_db(customer_id, charge["id"], billing_period)

    return {"charge_id": charge["id"], "status": "succeeded"}

When write_charge_to_db raises a connection error, Temporal retries charge_stripe from the top. The second attempt sends an identical charge request, which Stripe treats as new and fulfils as charge ch_B. The customer has been charged twice for the same billing period, and both charges look legitimate in the Stripe dashboard because both completed successfully.
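
The sequence can be reproduced without Temporal or Stripe. The sketch below is a toy model: FakeStripe, flaky_db_write, and run_with_retries are hypothetical stand-ins, and the retry loop only imitates "re-run the activity from the top on any exception".

```python
import itertools


class FakeStripe:
    """In-memory stand-in for a Stripe account with no idempotency handling."""

    def __init__(self):
        self.charges = []
        self._ids = itertools.count(1)

    def create_charge(self, customer_id, amount_cents):
        charge_id = f"ch_{next(self._ids)}"
        self.charges.append((charge_id, customer_id, amount_cents))
        return charge_id


def flaky_db_write(attempt):
    # Simulate a database that is unavailable during the first attempt only
    if attempt == 1:
        raise ConnectionError("db unavailable")


def charge_activity(stripe_api, attempt):
    charge_id = stripe_api.create_charge("cus_test", 9900)  # succeeds every time
    flaky_db_write(attempt)                                 # fails on attempt 1
    return charge_id


def run_with_retries(stripe_api, max_attempts=3):
    # Imitates an activity retry: the whole function body runs again
    for attempt in range(1, max_attempts + 1):
        try:
            return charge_activity(stripe_api, attempt)
        except ConnectionError:
            continue
    raise RuntimeError("activity exhausted its retries")


fake = FakeStripe()
result = run_with_retries(fake)
print(result, len(fake.charges))  # ch_2 2
```

The activity reports success with ch_2, so nothing in the workflow result hints that ch_1 also exists.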

The fix adds a content-hash idempotency key derived from the stable billing parameters, and separates the database write error from the charge result so Temporal can retry the write independently:

# billing_activities.py — SAFE: content-hash idempotency key + separated write
import hashlib
import stripe
from temporalio import activity
from temporalio.common import RetryPolicy
from datetime import timedelta

def make_charge_idempotency_key(
    customer_id: str, amount_cents: int, billing_period: str
) -> str:
    payload = f"{customer_id}:{amount_cents}:{billing_period}:temporal-billing"
    return hashlib.sha256(payload.encode()).hexdigest()[:36]

@activity.defn
async def charge_stripe(
    customer_id: str, amount_cents: int, billing_period: str
) -> dict:
    idempotency_key = make_charge_idempotency_key(
        customer_id, amount_cents, billing_period
    )

    # Stripe returns the original charge on any retry with the same key
    charge = stripe.Charge.create(
        amount=amount_cents,
        currency="usd",
        customer=customer_id,
        description=f"Subscription {billing_period}",
        idempotency_key=idempotency_key,
    )
    return {"charge_id": charge["id"], "status": "succeeded"}

@activity.defn
async def write_charge_record(
    customer_id: str, charge_id: str, billing_period: str
) -> dict:
    # DB write is a separate activity — retries here don't re-fire Stripe
    write_charge_to_db(customer_id, charge_id, billing_period)
    return {"written": True}

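# workflow.py: two activities, charge then record
from datetime import timedelta
from temporalio import workflow
from temporalio.common import RetryPolicy

# Pass activity imports through the workflow sandbox
with workflow.unsafe.imports_passed_through():
    from billing_activities import charge_stripe, write_charge_record
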
@workflow.defn
class BillingWorkflow:
    @workflow.run
    async def run(self, customer_id: str, amount_cents: int, billing_period: str):
        charge_result = await workflow.execute_activity(
            charge_stripe,
            args=[customer_id, amount_cents, billing_period],
            start_to_close_timeout=timedelta(seconds=30),
            retry_policy=RetryPolicy(maximum_attempts=5),
        )
        await workflow.execute_activity(
            write_charge_record,
            args=[customer_id, charge_result["charge_id"], billing_period],
            start_to_close_timeout=timedelta(seconds=10),
            retry_policy=RetryPolicy(maximum_attempts=10),
        )

The idempotency key is a SHA-256 hash of customer_id, amount_cents, and billing_period (truncated to 36 hex characters), so it is stable across all activity retries because those values don't change between attempts. Stripe returns the original ch_xxx object on every retry. The database write is now a separate activity with its own retry policy: if the write fails, only the write retries, never the charge. The activities can have different maximum_attempts settings: five for the charge (safe to retry because of the idempotency key), ten for the write (safe to retry as long as the write itself is idempotent, for example an upsert keyed on customer and billing period).
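
A minimal sketch of why the key works, and of one way it can stop working. FakeIdempotentStripe is a hypothetical in-memory model that returns the same charge for a repeated key; it is not Stripe's implementation.

```python
import hashlib


def make_charge_idempotency_key(customer_id, amount_cents, billing_period):
    payload = f"{customer_id}:{amount_cents}:{billing_period}:temporal-billing"
    return hashlib.sha256(payload.encode()).hexdigest()[:36]


class FakeIdempotentStripe:
    """Toy model: at most one charge per distinct idempotency key."""

    def __init__(self):
        self._charge_by_key = {}

    def create_charge(self, amount_cents, idempotency_key):
        if idempotency_key not in self._charge_by_key:
            self._charge_by_key[idempotency_key] = f"ch_{len(self._charge_by_key) + 1}"
        return self._charge_by_key[idempotency_key]

    @property
    def charge_count(self):
        return len(self._charge_by_key)


fake = FakeIdempotentStripe()
key = make_charge_idempotency_key("cus_test", 9900, "2026-06")
first = fake.create_charge(9900, key)
retried = fake.create_charge(9900, key)      # same key, same charge
print(first == retried, fake.charge_count)   # True 1

# If the amount is recomputed between attempts, the key changes with it
drifted_key = make_charge_idempotency_key("cus_test", 9950, "2026-06")
fake.create_charge(9950, drifted_key)
print(fake.charge_count)                     # 2
```

The second half is the design caveat: because amount_cents is part of the hash, the amount should be fixed in the workflow input before the activity is scheduled. An activity that recomputes metered usage on each attempt would derive a different key and lose the protection.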

Failure mode 2: Stripe calls placed in workflow code re-execute on every replay

Temporal's durability model relies on a fundamental constraint: workflow code must be deterministic and side-effect-free. A Temporal workflow is replayed from its event history on every worker restart. When a worker picks up a workflow after a crash, it re-runs the workflow function from the beginning, stepping through the event history to reconstruct state. Any code in the workflow function body that is not an await workflow.execute_activity() call executes again on every replay.

The failure pattern: a developer places a Stripe API call directly inside the @workflow.defn class instead of wrapping it in an @activity.defn function. This looks superficially correct: the charge runs, the workflow continues, and the first run succeeds. On the next worker restart (a deploy, a crash, or a Temporal Cloud namespace rebalance) the workflow is replayed, and the charge call in the workflow body executes again. Temporal's event history has no record of that call, because it never went through the SDK as an activity, so there is no stored result to substitute and nothing to stop the new request. The customer is charged on every replay.

# workflow.py — UNSAFE: Stripe call in workflow body, not in an activity
import stripe
from datetime import timedelta
from temporalio import workflow

@workflow.defn
class BillingWorkflow:
    @workflow.run
    async def run(self, customer_id: str, amount_cents: int, billing_period: str):
        # WRONG: Direct Stripe call in workflow code
        # This re-executes on every Temporal replay
        charge = stripe.Charge.create(
            amount=amount_cents,
            currency="usd",
            customer=customer_id,
            description=f"Subscription {billing_period}",
        )
        # The charge fires again on every worker restart / workflow replay
        charge_id = charge["id"]

        # Continue processing with charge_id...
        await workflow.execute_activity(
            send_receipt,
            args=[customer_id, charge_id],
            start_to_close_timeout=timedelta(seconds=10),
        )

Temporal's non-determinism detection will not reliably catch this. The SDK compares the commands a replay generates (which activities, timers, and child workflows are scheduled, and in what order) against the event history; it does not observe side effects made directly from workflow code. A replay that schedules the same activities in the same order can pass that check even though the charge ID has changed. Where a non-determinism error is raised, it surfaces at the replay boundary, after the API call has already executed and possibly after multiple charges have fired.

The fix is unambiguous: all Stripe API calls must live in @activity.defn functions. An activity's result is recorded in the event history when it completes, and Temporal does not re-run an activity that already has a completion event. An activity can still run more than once before it completes (that is what the retry policy does), which is why the idempotency key stays in place:

# workflow.py — SAFE: Stripe call wrapped in activity
import stripe
from temporalio import workflow, activity
from datetime import timedelta

# make_charge_idempotency_key is the helper defined in the earlier SAFE example

@activity.defn
async def charge_stripe_activity(
    customer_id: str, amount_cents: int, billing_period: str
) -> dict:
    # Activity functions are the correct place for Stripe calls
    # Temporal records the result once and never re-runs a completed activity
    idempotency_key = make_charge_idempotency_key(
        customer_id, amount_cents, billing_period
    )
    charge = stripe.Charge.create(
        amount=amount_cents,
        currency="usd",
        customer=customer_id,
        description=f"Subscription {billing_period}",
        idempotency_key=idempotency_key,
    )
    return {"charge_id": charge["id"]}

@workflow.defn
class BillingWorkflow:
    @workflow.run
    async def run(self, customer_id: str, amount_cents: int, billing_period: str):
        # Workflow code: only awaitable Temporal SDK calls, no side effects
        result = await workflow.execute_activity(
            charge_stripe_activity,
            args=[customer_id, amount_cents, billing_period],
            start_to_close_timeout=timedelta(seconds=30),
        )
        await workflow.execute_activity(
            send_receipt,
            args=[customer_id, result["charge_id"]],
            start_to_close_timeout=timedelta(seconds=10),
        )

When Temporal replays this workflow after a worker restart, it finds the charge_stripe_activity completion event in the history and skips re-executing the activity — returning the stored result directly. The Stripe API is never called again. This is Temporal's core safety guarantee, but it only applies to code placed inside execute_activity(), never to code in the workflow function body.
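
The difference between the two placements can be shown with a toy model of replay. ToyWorkflowContext below is a hypothetical illustration of event-history replay, not the Temporal SDK: results of calls made through execute_activity are stored and reused, while anything called directly from the workflow body simply runs again.

```python
class ToyWorkflowContext:
    """Toy replay model: recorded activity results are reused, not re-run."""

    def __init__(self, history):
        self.history = history  # survives "worker restarts"
        self._cursor = 0

    def execute_activity(self, fn, *args):
        if self._cursor < len(self.history):
            result = self.history[self._cursor]  # replay: reuse stored result
        else:
            result = fn(*args)                   # first execution: run and record
            self.history.append(result)
        self._cursor += 1
        return result


calls = {"inline": 0, "activity": 0}


def charge(label):
    # Stands in for the Stripe call; counts how often it really runs
    calls[label] += 1
    return f"ch_{label}_{calls[label]}"


def unsafe_workflow(ctx):
    return charge("inline")                          # side effect in workflow body


def safe_workflow(ctx):
    return ctx.execute_activity(charge, "activity")  # side effect behind the history


unsafe_history, safe_history = [], []
for _ in range(3):  # first run plus two replays
    unsafe_workflow(ToyWorkflowContext(unsafe_history))
    safe_workflow(ToyWorkflowContext(safe_history))

print(calls)  # {'inline': 3, 'activity': 1}
```

The unsafe workflow leaves its history empty, which is the point: there is nothing for a replay to consult, so every pass through the function charges again.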

Failure mode 3: Child workflow fan-out re-bills on parent retry

A common Temporal pattern for batch billing — charging every customer in a subscription tier at the start of a billing cycle — uses child workflows. The parent workflow iterates a list of customers and calls execute_child_workflow() for each one. Each child workflow handles one customer's charge, receipt, and record. The parent workflow completes when all children complete. This architecture is clean, observable, and idiomatic Temporal.

The failure emerges when the parent workflow is interrupted and retried. Consider a batch billing job for 400 customers, indexed 0–399. After the first 200 children (customers 0–199) complete, the Temporal worker crashes. When the worker restarts, Temporal replays the parent workflow. For customers 0–199, Temporal finds completed child workflow events in the history and does not start those children again. For customers 200–399, the parent goes on to start the child workflows, correctly, since those children never ran. Those children create their Stripe charges with no idempotency key, and all 200 charges fire correctly, once each.

Now consider the more dangerous scenario: the parent workflow is retried by the caller — not by Temporal's internal recovery — using a new workflow_id. This happens when an operator sees the original workflow in a TERMINATED state and manually re-runs the billing job. Temporal sees a new workflow execution. No prior history exists. The parent dispatches child workflows for all 400 customers. All 400 children call Stripe. The 200 customers who were charged in the first run are charged again.

# billing_workflows.py — UNSAFE: child workflows without idempotency keys
import stripe
from temporalio import workflow, activity
from datetime import timedelta

@activity.defn
async def charge_customer_activity(customer_id: str, amount_cents: int, billing_period: str) -> dict:
    # No idempotency key: each new child workflow invocation creates a new charge
    charge = stripe.Charge.create(
        amount=amount_cents,
        currency="usd",
        customer=customer_id,
        description=f"Subscription {billing_period}",
    )
    return {"charge_id": charge["id"]}

@workflow.defn
class CustomerBillingWorkflow:
    @workflow.run
    async def run(self, customer_id: str, amount_cents: int, billing_period: str) -> dict:
        return await workflow.execute_activity(
            charge_customer_activity,
            args=[customer_id, amount_cents, billing_period],
            start_to_close_timeout=timedelta(seconds=30),
        )

@workflow.defn
class BatchBillingWorkflow:
    @workflow.run
    async def run(self, customers: list, billing_period: str) -> list:
        results = []
        for customer in customers:
            # Each child workflow has its own execution scope
            # If the parent is re-run (new workflow_id), all children re-fire
            result = await workflow.execute_child_workflow(
                CustomerBillingWorkflow.run,
                args=[customer["id"], customer["amount_cents"], billing_period],
                id=f"billing-{customer['id']}-{billing_period}",  # deterministic ID
            )
            results.append(result)
        return results

The child workflow ID billing-{customer_id}-{billing_period} is deterministic, which protects against duplicate child workflow dispatches within the same parent execution. It does not protect against a new parent execution re-dispatching the same child workflow IDs: Temporal raises WorkflowAlreadyStartedError for children that are still running, but under the default workflow ID reuse policy a completed child can be started again under the same ID. The fix combines two defenses: idempotency keys in the charge activity (so Stripe deduplicates even if the charge activity runs twice), and a pre-charge existence check that short-circuits the child workflow if the billing period is already recorded:

# billing_workflows.py — SAFE: idempotency key + pre-charge guard
import hashlib
import stripe
from temporalio import workflow, activity
from temporalio.common import RetryPolicy
from datetime import timedelta

def make_charge_idempotency_key(
    customer_id: str, amount_cents: int, billing_period: str
) -> str:
    payload = f"{customer_id}:{amount_cents}:{billing_period}:temporal-billing"
    return hashlib.sha256(payload.encode()).hexdigest()[:36]

@activity.defn
async def check_existing_charge(customer_id: str, billing_period: str) -> dict:
    # Reads the local billing record only; read-only, so safe to retry
    existing = get_charge_from_db(customer_id, billing_period)
    if existing:
        return {"charge_id": existing["charge_id"], "already_charged": True}
    return {"already_charged": False}

@activity.defn
async def charge_customer_activity(
    customer_id: str, amount_cents: int, billing_period: str
) -> dict:
    idempotency_key = make_charge_idempotency_key(
        customer_id, amount_cents, billing_period
    )
    charge = stripe.Charge.create(
        amount=amount_cents,
        currency="usd",
        customer=customer_id,
        description=f"Subscription {billing_period}",
        idempotency_key=idempotency_key,
    )
    write_charge_to_db(customer_id, charge["id"], billing_period)
    return {"charge_id": charge["id"], "already_charged": False}

@workflow.defn
class CustomerBillingWorkflow:
    @workflow.run
    async def run(self, customer_id: str, amount_cents: int, billing_period: str) -> dict:
        # Pre-flight check before Stripe call — catches re-run scenarios
        existing = await workflow.execute_activity(
            check_existing_charge,
            args=[customer_id, billing_period],
            start_to_close_timeout=timedelta(seconds=10),
        )
        if existing["already_charged"]:
            return existing  # Short-circuit: already billed this period

        return await workflow.execute_activity(
            charge_customer_activity,
            args=[customer_id, amount_cents, billing_period],
            start_to_close_timeout=timedelta(seconds=30),
            retry_policy=RetryPolicy(maximum_attempts=5),
        )

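The check_existing_charge activity is read-only. As written it consults the local charge record; if it is extended to query Stripe directly, it should use the audit vault key scoped to GET /v1/charges, which can never create charges. Either way it is safe to retry without limit. If the customer was already billed for this period (in a prior execution of the same workflow, or in a prior run of the batch job), the child workflow short-circuits before touching Stripe. The idempotency key in charge_customer_activity provides a second layer of protection at the Stripe API level for cases where the DB record hasn't been written yet when the re-run occurs. The guard matters most for late re-runs, because Stripe does not retain idempotency keys indefinitely (its documentation describes keys as eligible for removal once they are at least 24 hours old).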
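
The 400-customer scenario can be replayed in a few lines to see what the guard changes. This is a toy model under stated assumptions: run_batch and bill_twice are hypothetical, a set stands in for the billing database, and appending to a list stands in for the Stripe call.

```python
def run_batch(customers, billing_period, ledger, charged, guard=True, stop_after=None):
    """Toy batch run; `charged.append` stands in for creating a Stripe charge."""
    for index, customer_id in enumerate(customers):
        if stop_after is not None and index == stop_after:
            return "interrupted"                  # simulated crash or termination
        if guard and (customer_id, billing_period) in ledger:
            continue                              # pre-charge guard: already billed
        charged.append(customer_id)
        ledger.add((customer_id, billing_period))
    return "completed"


customers = [f"cus_{n}" for n in range(400)]


def bill_twice(guard):
    ledger, charged = set(), []
    run_batch(customers, "2026-06", ledger, charged, guard, stop_after=200)
    run_batch(customers, "2026-06", ledger, charged, guard)  # operator re-run
    duplicates = len(charged) - len(set(charged))
    return len(charged), duplicates


print(bill_twice(guard=False))  # (600, 200)
print(bill_twice(guard=True))   # (400, 0)
```

Without the guard the re-run produces 600 charges, 200 of them duplicates. With it, the re-run bills only the customers the first run never reached.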

The complete governance pattern

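All three failure modes share a common root: Temporal's retry and replay semantics operate transparently at the infrastructure layer, below the Stripe API call. The two-layer governance pattern addresses both layers. In application code, the idempotency key, activity separation, and pre-charge guard control how Stripe is called; at the key-scoping layer, vault keys enforced by a proxy limit what any call can do: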

Layer | Mechanism | What it prevents
Idempotency key | SHA-256 hash of customer_id + amount_cents + billing_period + "temporal-billing" | Duplicate charges on activity retry; charges on workflow re-run
Activity separation | All Stripe calls inside @activity.defn, never in workflow body | Re-execution of Stripe calls during Temporal workflow replay
Pre-charge guard | check_existing_charge activity reads DB before calling Stripe | Re-billing on parent workflow re-run or operator retry
Vault key scoping | Billing vault key: POST /v1/charges only, daily USD cap. Audit vault key: GET /v1/charges only. | Runaway retry loops exceeding the daily spend cap; accidental charge from audit key
Proxy routing | stripe.StripeClient pointed at https://proxy.keybrake.com/stripe | Enforces vault key scope and spend cap at network layer

Routing through a proxy adds one line of configuration in the Temporal activity:

# billing_activities.py: vault key via Keybrake proxy
import os
import stripe
from temporalio import activity

@activity.defn
async def charge_stripe_activity(
    customer_id: str, amount_cents: int, billing_period: str
) -> dict:
    # Vault key scoped to POST /v1/charges only; daily cap = max expected batch
    billing_client = stripe.StripeClient(
        api_key=os.environ["VAULT_KEY_BILLING"],
        # stripe-python overrides the API host via base_addresses
        base_addresses={"api": "https://proxy.keybrake.com/stripe"},
    )

    idempotency_key = make_charge_idempotency_key(
        customer_id, amount_cents, billing_period
    )
    charge = billing_client.charges.create(
        params={
            "amount": amount_cents,
            "currency": "usd",
            "customer": customer_id,
            "description": f"Subscription {billing_period}",
        },
        options={"idempotency_key": idempotency_key},
    )
    return {"charge_id": charge.id}

@activity.defn
async def check_existing_charge(customer_id: str, billing_period: str) -> dict:
    # Audit vault key: GET /v1/charges only. This sketch checks only the
    # local record; audit_client is there for reconciling against Stripe.
    audit_client = stripe.StripeClient(
        api_key=os.environ["VAULT_KEY_AUDIT"],
        base_addresses={"api": "https://proxy.keybrake.com/stripe"},
    )
    existing = get_charge_from_db(customer_id, billing_period)
    if existing:
        return {"charge_id": existing["charge_id"], "already_charged": True}
    return {"already_charged": False}

Governance comparison

Pattern | Bare key risk | Restricted key | Idempotency key | Spend cap | Audit log | Full governance
Temporal activity retry (no idem key) | Duplicate charge per retry attempt | Doesn't prevent duplicate charges | Stripe deduplicates across all retries | Caps retry blast radius | Each attempt logged | All three layers
Stripe call in workflow body | New charge on every replay | Doesn't prevent replay charges | Partially helps; architecture fix required | Caps replay blast radius | Replay charges logged | Move to activity + idem key
Child workflow fan-out re-run | Full customer list re-billed | Doesn't prevent re-billing | Deduplicates at Stripe layer | Caps per-run spend before batch exhausts | Re-run charges logged | Pre-charge guard + idem key + vault key

Enforcement with pytest

# tests/test_temporal_billing_governance.py
import hashlib
import pytest
from unittest.mock import patch, MagicMock

def make_idempotency_key(customer_id, amount_cents, billing_period):
    payload = f"{customer_id}:{amount_cents}:{billing_period}:temporal-billing"
    return hashlib.sha256(payload.encode()).hexdigest()[:36]

def test_idempotency_key_is_stable_across_retries():
    key1 = make_idempotency_key("cus_test", 9900, "2026-06")
    key2 = make_idempotency_key("cus_test", 9900, "2026-06")
    assert key1 == key2

def test_different_periods_produce_different_keys():
    key_june = make_idempotency_key("cus_test", 9900, "2026-06")
    key_july = make_idempotency_key("cus_test", 9900, "2026-07")
    assert key_june != key_july

@pytest.mark.asyncio
async def test_charge_activity_passes_idempotency_key_to_stripe():
    with patch("stripe.Charge.create") as mock_create:
        mock_create.return_value = MagicMock(id="ch_test_123")
        from billing_activities import charge_stripe_activity
        result = await charge_stripe_activity("cus_test", 9900, "2026-06")
        call_kwargs = mock_create.call_args[1]
        assert "idempotency_key" in call_kwargs
        assert call_kwargs["idempotency_key"] == make_idempotency_key(
            "cus_test", 9900, "2026-06"
        )

@pytest.mark.asyncio
async def test_existing_charge_short_circuits_before_stripe():
    with patch("billing_workflows.get_charge_from_db") as mock_db:
        mock_db.return_value = {"charge_id": "ch_already_done"}
        with patch("stripe.Charge.create") as mock_stripe:
            from billing_workflows import check_existing_charge
            # Call the guard activity itself rather than asserting on a literal
            existing = await check_existing_charge("cus_test", "2026-06")
            mock_stripe.assert_not_called()
            assert existing["already_charged"] is True
            assert existing["charge_id"] == "ch_already_done"

def test_vault_key_env_vars_are_separate():
    import os
    billing_key = os.environ.get("VAULT_KEY_BILLING", "")
    audit_key = os.environ.get("VAULT_KEY_AUDIT", "")
    if billing_key and audit_key:
        assert billing_key != audit_key, "Billing and audit vault keys must be different"

FAQ

Temporal deduplicates workflows by workflow ID — doesn't that protect me from duplicate charges?

Temporal's workflow ID uniqueness prevents two workflows with the same ID from running simultaneously in the same namespace. Under the default reuse policy it does not prevent a closed workflow from being started again under the same ID, and it does nothing about activities within a workflow being retried. The idempotency key operates at the Stripe API layer, so it deduplicates charges regardless of how many times the activity runs.

Should I set maximum_attempts=1 for billing activities to prevent retries?

Disabling retries on billing activities creates a different problem: a transient Stripe API timeout (HTTP 503, connection reset) causes the billing workflow to fail permanently, requiring manual intervention for every network hiccup. With an idempotency key, you can safely allow retries — Stripe's idempotency guarantees that any number of retries with the same key return the same charge object. Set maximum_attempts to a small number (3–5) to prevent infinite retry storms, but don't set it to 1.

Can I use the Temporal workflow run ID as the idempotency key?

A workflow's run ID changes when the workflow is retried, continues as new, or is re-run by an operator. Activity retries within one run share the run ID, but any new run produces a new key: Stripe sees a new request and creates a new charge. Use stable business parameters (customer ID + amount + billing period) instead. These values are identical across all retries and re-runs of the same billing intent.
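
A small comparison of the two derivations across three executions of the same billing intent. simulated_run_id is a hypothetical stand-in that returns a fresh UUID per execution, which is the only property of a run ID this sketch relies on.

```python
import hashlib
import uuid


def business_key(customer_id, amount_cents, billing_period):
    payload = f"{customer_id}:{amount_cents}:{billing_period}:temporal-billing"
    return hashlib.sha256(payload.encode()).hexdigest()[:36]


def simulated_run_id():
    # Stand-in for a Temporal run ID: unique to each execution
    return str(uuid.uuid4())


executions = 3  # original run, workflow retry, operator re-run
run_id_keys = {simulated_run_id() for _ in range(executions)}
business_keys = {business_key("cus_test", 9900, "2026-06") for _ in range(executions)}

print(len(run_id_keys), len(business_keys))  # 3 1
```

Three distinct keys mean three requests that Stripe has no reason to treat as the same charge; one key means one charge.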

How does heartbeating interact with billing activities?

Temporal heartbeating (activity.heartbeat()) reports liveness to the server, which resets the activity's heartbeat timeout, and lets the activity record progress details that the next attempt can read if this one fails or times out. For billing activities, you can use heartbeat details to pass the charge ID forward: after the charge is created, call activity.heartbeat(charge_id). If the attempt dies before completing, the next attempt can check activity.info().heartbeat_details for the charge ID and skip the Stripe call entirely. Treat this as an optimization rather than a guarantee: a heartbeat that has not reached the server before the worker dies is lost, so the idempotency key remains the actual safeguard.

What happens to in-flight Stripe charges during a Temporal worker crash?

If the Temporal worker crashes while a charge request is in flight, the activity attempt never reports a result. Temporal notices when the heartbeat timeout (if one is set) or the start-to-close timeout expires, and schedules a new attempt, which any available worker can pick up. The new attempt starts the activity from the beginning. If Stripe completed the charge before the crash, the charge exists. With the same idempotency key, the repeated request returns the original charge object and no new charge is created. Without an idempotency key, the new attempt creates a second charge.

How do I scope vault keys across multiple Temporal namespaces?

Create one vault key pair per billing context: a billing key (POST /v1/charges only) and an audit key (GET /v1/charges only) per namespace or product line. Store them as environment variables on the specific worker pools that run billing workflows. Workers running non-billing workflows should not have access to the billing vault key — a misconfigured worker shouldn't be able to charge customers even if it receives a billing task by mistake. The proxy's per-key daily spend cap provides a hard ceiling even if a vault key is accidentally shared.
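
As a way to reason about what a scoped key with a cap allows, here is a hypothetical model of the decision a proxy would have to make per request. VaultKeyPolicy and its fields are assumptions made for illustration; this is not Keybrake's implementation or API.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class VaultKeyPolicy:
    """Hypothetical model of one scoped key: allowed endpoints plus a daily cap."""

    allowed: frozenset              # {(method, path), ...}
    daily_cap_cents: int
    spent_by_day: dict = field(default_factory=dict)

    def authorize(self, method: str, path: str, amount_cents: int, today: date) -> bool:
        if (method, path) not in self.allowed:
            return False            # outside the key's scope
        spent = self.spent_by_day.get(today, 0)
        if spent + amount_cents > self.daily_cap_cents:
            return False            # would exceed the daily cap
        self.spent_by_day[today] = spent + amount_cents
        return True


billing = VaultKeyPolicy(frozenset({("POST", "/v1/charges")}), daily_cap_cents=20_000)
audit = VaultKeyPolicy(frozenset({("GET", "/v1/charges")}), daily_cap_cents=0)
day = date(2026, 6, 1)

results = [billing.authorize("POST", "/v1/charges", 9900, day) for _ in range(3)]
print(results)                                             # [True, True, False]
print(audit.authorize("POST", "/v1/charges", 9900, day))   # False
print(audit.authorize("GET", "/v1/charges", 0, day))       # True
```

In this model the third 9,900-cent charge is refused because it would push the day's total past 20,000 cents, and the audit key cannot create a charge at all. That is the ceiling a runaway retry loop would hit.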

Keybrake: vault keys and spend caps for Temporal billing agents

Issue scoped vault keys for each Temporal activity type. Set per-key daily USD caps that stop runaway retry loops before they exhaust your billing budget. Full audit log of every Stripe call with workflow ID and activity attempt number. One-line proxy URL override in your Temporal activity code.