Temporal Stripe Integration: Restricted API Keys, Spend Caps, and Agent Governance
Temporal's durable execution model is a strong foundation for AI agent workflows that touch money. But the same durability features (automatic activity retries, replay-based workflow recovery, and child workflow fan-out) can silently create duplicate Stripe charges when billing activities fail at the wrong moment.
Temporal is a durable workflow platform that makes long-running processes resilient by replaying workflow code from event history on worker failure, retrying failed activities according to a configurable RetryPolicy, and persisting workflow state across process restarts. For AI agents handling Stripe billing — subscription renewals, usage-based invoicing, metered payment processing — Temporal's durability is a natural fit. You don't want a billing workflow to simply disappear when a worker crashes mid-charge. The challenge is that Temporal's retry and replay semantics operate at the infrastructure layer and have no awareness of Stripe's charge semantics. A Stripe charge that completes successfully is not rolled back when Temporal retries the containing activity. Temporal's event-sourced replay faithfully re-executes any code placed in the wrong context. These two properties, combined, produce silent double billing in patterns that otherwise look correct.
This post covers three failure modes specific to Temporal's architecture — activity retry, workflow replay violation, and child workflow fan-out — with Python SDK code for each, and the two-layer governance pattern that closes all three.
Failure mode 1: Activity retry policy fires a duplicate charge
Temporal activities carry a RetryPolicy that is applied automatically when an activity raises an exception. The default policy retries indefinitely with exponential backoff starting at one second. This is the right default for most network-dependent operations: if a database write times out, you want the activity to retry until it succeeds. For Stripe billing, the default retry semantics are dangerous.
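To make the default schedule concrete, here is a minimal sketch of the wait before each retry. It assumes the documented defaults (a one-second initial interval, a backoff coefficient of 2.0, and a maximum interval of 100 times the initial interval). The retry_delays helper is hypothetical and written for illustration; it is not part of the Temporal SDK.

```python
def retry_delays(retries, initial_s=1.0, coefficient=2.0, max_s=100.0):
    """Seconds waited before each retry under exponential backoff with a cap."""
    return [min(initial_s * coefficient ** n, max_s) for n in range(retries)]


delays = retry_delays(8)
print(delays)       # wait before retry 1, 2, 3, ...
print(sum(delays))  # total seconds spent waiting across eight retries
```

Each of those retries re-runs the whole activity body, so with unlimited attempts an activity that keeps failing after its Stripe call keeps calling Stripe at the capped interval.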
The failure pattern: a charge_stripe activity calls stripe.Charge.create(). The Stripe API call succeeds: the charge is created and a charge ID is returned. The activity then attempts to write the charge record to a database, and the write raises a connection error. Temporal receives the exception, marks the attempt as failed, waits one second, and re-runs the activity from the top. The activity calls stripe.Charge.create() again with the same parameters. Because no idempotency key was provided, Stripe treats this as a new request and creates a second charge. The customer is billed twice.
# billing_activities.py (UNSAFE: no idempotency key, default retry policy)
import stripe
from temporalio import activity

stripe.api_key = "sk_live_..."  # Bare production key: full scope

@activity.defn
async def charge_stripe(
    customer_id: str, amount_cents: int, billing_period: str
) -> dict:
    # Stripe charge succeeds: charge object created
    charge = stripe.Charge.create(
        amount=amount_cents,
        currency="usd",
        customer=customer_id,
        description=f"Subscription {billing_period}",
        # No idempotency_key: every retry = new Stripe charge
    )
    # write_charge_to_db is the application's own persistence helper (not shown).
    # If it raises, Temporal retries the entire activity from the top:
    # stripe.Charge.create() runs again → second charge, no dedup
    write_charge_to_db(customer_id, charge["id"], billing_period)
    return {"charge_id": charge["id"], "status": "succeeded"}
When write_charge_to_db raises a DatabaseConnectionError, Temporal retries charge_stripe. The second invocation calls stripe.Charge.create() again with identical parameters. Stripe sees a new request and creates a second charge, ch_B, alongside the original ch_A. The customer has been charged twice for the same billing period, and both charges look legitimate in the Stripe dashboard because both completed successfully.
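The retry sequence can be reproduced without Temporal or Stripe. The sketch below is a simulation under stated assumptions: FakeStripe is a hypothetical in-memory stand-in that replays the stored response when it sees a known idempotency key, and run_with_retries is a simplified loop that re-runs the whole activity on failure. Neither is a real SDK object.

```python
import itertools


class FakeStripe:
    """In-memory stand-in for the charge endpoint (hypothetical, not the Stripe SDK)."""

    def __init__(self):
        self.charges = []
        self._by_key = {}
        self._ids = itertools.count(1)

    def create_charge(self, amount, customer, idempotency_key=None):
        if idempotency_key is not None and idempotency_key in self._by_key:
            return self._by_key[idempotency_key]  # known key: replay stored response
        charge = {"id": f"ch_{next(self._ids)}", "amount": amount, "customer": customer}
        self.charges.append(charge)
        if idempotency_key is not None:
            self._by_key[idempotency_key] = charge
        return charge


def run_with_retries(activity_fn, max_attempts):
    """Re-run the whole activity on failure, as an activity retry does."""
    for attempt in range(1, max_attempts + 1):
        try:
            return activity_fn(attempt)
        except ConnectionError:
            if attempt == max_attempts:
                raise


def make_activity(stripe_api, idempotency_key):
    def activity_fn(attempt):
        charge = stripe_api.create_charge(9900, "cus_test", idempotency_key)
        if attempt == 1:
            # The DB write fails after the charge already exists at Stripe
            raise ConnectionError("db write failed")
        return charge["id"]
    return activity_fn


unsafe = FakeStripe()
run_with_retries(make_activity(unsafe, None), max_attempts=3)

safe = FakeStripe()
run_with_retries(make_activity(safe, "cus_test:9900:2026-06"), max_attempts=3)

print(len(unsafe.charges), len(safe.charges))
```

The only difference between the two runs is the idempotency key, which is what the fix introduces.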
The fix adds a content-hash idempotency key derived from the stable billing parameters, and separates the database write error from the charge result so Temporal can retry the write independently:
# billing_activities.py (SAFE: content-hash idempotency key + separated write)
import hashlib
import stripe
from temporalio import activity

def make_charge_idempotency_key(
    customer_id: str, amount_cents: int, billing_period: str
) -> str:
    payload = f"{customer_id}:{amount_cents}:{billing_period}:temporal-billing"
    return hashlib.sha256(payload.encode()).hexdigest()[:36]

@activity.defn
async def charge_stripe(
    customer_id: str, amount_cents: int, billing_period: str
) -> dict:
    idempotency_key = make_charge_idempotency_key(
        customer_id, amount_cents, billing_period
    )
    # Stripe returns the original charge on any retry with the same key
    charge = stripe.Charge.create(
        amount=amount_cents,
        currency="usd",
        customer=customer_id,
        description=f"Subscription {billing_period}",
        idempotency_key=idempotency_key,
    )
    return {"charge_id": charge["id"], "status": "succeeded"}

@activity.defn
async def write_charge_record(
    customer_id: str, charge_id: str, billing_period: str
) -> dict:
    # DB write is a separate activity: retries here don't re-fire Stripe
    write_charge_to_db(customer_id, charge_id, billing_period)
    return {"written": True}

# workflow.py: two activities, charge then record
from datetime import timedelta
from temporalio import workflow
from temporalio.common import RetryPolicy

# The activity module imports stripe, so pass it through the workflow sandbox
with workflow.unsafe.imports_passed_through():
    from billing_activities import charge_stripe, write_charge_record

@workflow.defn
class BillingWorkflow:
    @workflow.run
    async def run(self, customer_id: str, amount_cents: int, billing_period: str):
        charge_result = await workflow.execute_activity(
            charge_stripe,
            args=[customer_id, amount_cents, billing_period],
            start_to_close_timeout=timedelta(seconds=30),
            retry_policy=RetryPolicy(maximum_attempts=5),
        )
        await workflow.execute_activity(
            write_charge_record,
            args=[customer_id, charge_result["charge_id"], billing_period],
            start_to_close_timeout=timedelta(seconds=10),
            retry_policy=RetryPolicy(maximum_attempts=10),
        )
The idempotency key is a SHA-256 hash of customer_id, amount_cents, and billing_period (plus a fixed "temporal-billing" suffix), so it is stable across all activity retries: those values don't change between attempts. Stripe returns the original ch_xxx object on every retry that reuses the key. Stripe only guarantees this while it retains the key, which is at least 24 hours, so retries should finish well inside that window. The database write is now a separate activity with its own retry policy: if the write fails, only the write retries, never the charge. The activities can have different maximum_attempts settings: five for the charge (safe to retry because of the idempotency key) and ten for the write (safe to retry as long as the write itself is an idempotent upsert).
Failure mode 2: Stripe calls placed in workflow code re-execute on every replay
Temporal's durability model relies on a fundamental constraint: workflow code must be deterministic and side-effect-free. A Temporal workflow is replayed from its event history on every worker restart. When a worker picks up a workflow after a crash, it re-runs the workflow function from the beginning, stepping through the event history to reconstruct state. Any code in the workflow function body that is not an await workflow.execute_activity() call executes again on every replay.
The failure pattern: a developer places a Stripe API call directly inside the @workflow.defn class instead of wrapping it in an @activity.defn function. This looks superficially correct: the charge runs, the workflow continues, and the first run succeeds. On the next worker restart (a deploy, a crash, or a Temporal Cloud namespace rebalance) any workflow that is still open is replayed, and the stripe.Charge.create() call in the workflow body executes again. Temporal's event history holds no record of the call, because only commands issued through the SDK (activities, timers, child workflows) produce events, so replay has nothing to substitute for it. The customer is charged again on every replay.
# workflow.py (UNSAFE: Stripe call in workflow body, not in an activity)
from datetime import timedelta
import stripe
from temporalio import workflow

@workflow.defn
class BillingWorkflow:
    @workflow.run
    async def run(self, customer_id: str, amount_cents: int, billing_period: str):
        # WRONG: Direct Stripe call in workflow code
        # This re-executes on every Temporal replay
        charge = stripe.Charge.create(
            amount=amount_cents,
            currency="usd",
            customer=customer_id,
            description=f"Subscription {billing_period}",
        )
        # The charge fires again on every worker restart / workflow replay
        charge_id = charge["id"]
        # Continue processing with charge_id...
        # send_receipt is an activity defined elsewhere (not shown)
        await workflow.execute_activity(
            send_receipt,
            args=[customer_id, charge_id],
            start_to_close_timeout=timedelta(seconds=10),
        )
Temporal's non-determinism detection is not a safeguard here. It works by comparing the commands a workflow emits during replay against the event history, and that comparison can only happen after the workflow code, including the Stripe call, has already run. A direct API call emits no command at all, so replay may not raise any error, and even when one does surface, multiple charges may already have fired.
The fix is unambiguous: all Stripe API calls must live in @activity.defn functions. Once an activity has a completion event in the history, Temporal does not run it again during replay; it hands the recorded result back to the workflow. An activity can still run more than once before it completes, through retries, which is why the idempotency key stays in place. The safe version:
# workflow.py (SAFE: Stripe call wrapped in activity)
from datetime import timedelta
import stripe
from temporalio import workflow, activity

# make_charge_idempotency_key comes from billing_activities.py;
# send_receipt is an activity defined elsewhere (not shown)

@activity.defn
async def charge_stripe_activity(
    customer_id: str, amount_cents: int, billing_period: str
) -> dict:
    # Activity functions are the correct place for Stripe calls.
    # Once this completes, replay reuses the recorded result; retries
    # before completion are deduplicated by the idempotency key.
    idempotency_key = make_charge_idempotency_key(
        customer_id, amount_cents, billing_period
    )
    charge = stripe.Charge.create(
        amount=amount_cents,
        currency="usd",
        customer=customer_id,
        description=f"Subscription {billing_period}",
        idempotency_key=idempotency_key,
    )
    return {"charge_id": charge["id"]}

@workflow.defn
class BillingWorkflow:
    @workflow.run
    async def run(self, customer_id: str, amount_cents: int, billing_period: str):
        # Workflow code: only awaitable Temporal SDK calls, no side effects
        result = await workflow.execute_activity(
            charge_stripe_activity,
            args=[customer_id, amount_cents, billing_period],
            start_to_close_timeout=timedelta(seconds=30),
        )
        await workflow.execute_activity(
            send_receipt,
            args=[customer_id, result["charge_id"]],
            start_to_close_timeout=timedelta(seconds=10),
        )
When Temporal replays this workflow after a worker restart, it finds the charge_stripe_activity completion event in the history and skips re-executing the activity, returning the stored result directly. The Stripe API is never called again. This is Temporal's core safety guarantee, but it only applies to work invoked through execute_activity(), never to code in the workflow function body.
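The difference between the two placements can be seen in a toy model of replay. This is a deliberately simplified sketch, not how the Temporal SDK is implemented: ToyWorkflowContext is a hypothetical class that records activity results in a list and reads them back on a second run, while anything called directly in the workflow function simply runs again.

```python
class ToyWorkflowContext:
    """Toy model of replay: activity results are recorded once, then read back."""

    def __init__(self, history=None):
        self.history = list(history or [])  # recorded activity results, in order
        self._cursor = 0

    def execute_activity(self, fn, *args):
        if self._cursor < len(self.history):
            result = self.history[self._cursor]  # replay: reuse the recorded result
        else:
            result = fn(*args)                   # first execution: run and record
            self.history.append(result)
        self._cursor += 1
        return result


calls = {"in_body": 0, "in_activity": 0}


def charge_in_body():
    calls["in_body"] += 1
    return "ch_body"


def charge_activity():
    calls["in_activity"] += 1
    return "ch_activity"


def toy_workflow(ctx):
    charge_in_body()                              # side effect in workflow code
    return ctx.execute_activity(charge_activity)  # side effect behind the history


first = ToyWorkflowContext()
toy_workflow(first)

replayed = ToyWorkflowContext(history=first.history)  # simulated worker restart
toy_workflow(replayed)

print(calls)
```

After one run and one replay, the call made in the workflow body has executed twice and the call made through the activity has executed once.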
Failure mode 3: Child workflow fan-out re-bills on parent retry
A common Temporal pattern for batch billing — charging every customer in a subscription tier at the start of a billing cycle — uses child workflows. The parent workflow iterates a list of customers and calls execute_child_workflow() for each one. Each child workflow handles one customer's charge, receipt, and record. The parent workflow completes when all children complete. This architecture is clean, observable, and idiomatic Temporal.
The failure emerges when the parent workflow is interrupted and retried. Consider a batch billing job for 400 customers, numbered 0–399. After the children for the first 200 customers (0–199) complete, the Temporal worker crashes. When the worker restarts, Temporal replays the parent workflow. For customers 0–199, Temporal finds completed child workflow events in the history and does not start those children again. For customers 200–399, the parent goes on to start the child workflows, correctly, since those children never ran to completion. Those children call stripe.Charge.create() with no idempotency key, and all 200 remaining charges fire once, as intended. Temporal's own recovery is safe here.
Now consider the more dangerous scenario: the parent workflow is retried by the caller — not by Temporal's internal recovery — using a new workflow_id. This happens when an operator sees the original workflow in a TERMINATED state and manually re-runs the billing job. Temporal sees a new workflow execution. No prior history exists. The parent dispatches child workflows for all 400 customers. All 400 children call Stripe. The 200 customers who were charged in the first run are charged again.
# billing_workflows.py (UNSAFE: child workflows without idempotency keys)
from datetime import timedelta
import stripe
from temporalio import workflow, activity

@activity.defn
async def charge_customer_activity(customer_id: str, amount_cents: int, billing_period: str) -> dict:
    # No idempotency key: each new child workflow invocation creates a new charge
    charge = stripe.Charge.create(
        amount=amount_cents,
        currency="usd",
        customer=customer_id,
        description=f"Subscription {billing_period}",
    )
    return {"charge_id": charge["id"]}

@workflow.defn
class CustomerBillingWorkflow:
    @workflow.run
    async def run(self, customer_id: str, amount_cents: int, billing_period: str) -> dict:
        return await workflow.execute_activity(
            charge_customer_activity,
            args=[customer_id, amount_cents, billing_period],
            start_to_close_timeout=timedelta(seconds=30),
        )

@workflow.defn
class BatchBillingWorkflow:
    @workflow.run
    async def run(self, customers: list, billing_period: str) -> list:
        results = []
        for customer in customers:
            # Each child workflow has its own execution scope
            # If the parent is re-run (new workflow_id), all children re-fire
            result = await workflow.execute_child_workflow(
                CustomerBillingWorkflow.run,
                args=[customer["id"], customer["amount_cents"], billing_period],
                id=f"billing-{customer['id']}-{billing_period}",  # deterministic ID
            )
            results.append(result)
        return results
The child workflow ID billing-{customer_id}-{billing_period} is deterministic — this protects against duplicate child workflow dispatches within the same parent execution. But it does not protect against a new parent execution re-dispatching the same child workflow IDs (Temporal raises WorkflowAlreadyStartedError for still-running children, but completed children can be re-started under the same ID in a new execution context). The fix combines two defenses: idempotency keys in the charge activity (so Stripe deduplicates even if the charge activity runs twice), and a pre-charge existence check that short-circuits the activity if the billing period is already recorded:
# billing_workflows.py (SAFE: idempotency key + pre-charge guard)
import hashlib
from datetime import timedelta
import stripe
from temporalio import workflow, activity
from temporalio.common import RetryPolicy

def make_charge_idempotency_key(
    customer_id: str, amount_cents: int, billing_period: str
) -> str:
    payload = f"{customer_id}:{amount_cents}:{billing_period}:temporal-billing"
    return hashlib.sha256(payload.encode()).hexdigest()[:36]

@activity.defn
async def check_existing_charge(customer_id: str, billing_period: str) -> dict:
    # Read-only lookup of the billing record: never creates a charge, safe to retry
    existing = get_charge_from_db(customer_id, billing_period)
    if existing:
        return {"charge_id": existing["charge_id"], "already_charged": True}
    return {"already_charged": False}

@activity.defn
async def charge_customer_activity(
    customer_id: str, amount_cents: int, billing_period: str
) -> dict:
    idempotency_key = make_charge_idempotency_key(
        customer_id, amount_cents, billing_period
    )
    charge = stripe.Charge.create(
        amount=amount_cents,
        currency="usd",
        customer=customer_id,
        description=f"Subscription {billing_period}",
        idempotency_key=idempotency_key,
    )
    # If this write fails, the retry reuses the same key and Stripe dedups
    write_charge_to_db(customer_id, charge["id"], billing_period)
    return {"charge_id": charge["id"], "already_charged": False}

@workflow.defn
class CustomerBillingWorkflow:
    @workflow.run
    async def run(self, customer_id: str, amount_cents: int, billing_period: str) -> dict:
        # Pre-flight check before Stripe call: catches re-run scenarios
        existing = await workflow.execute_activity(
            check_existing_charge,
            args=[customer_id, billing_period],
            start_to_close_timeout=timedelta(seconds=10),
        )
        if existing["already_charged"]:
            return existing  # Short-circuit: already billed this period
        return await workflow.execute_activity(
            charge_customer_activity,
            args=[customer_id, amount_cents, billing_period],
            start_to_close_timeout=timedelta(seconds=30),
            retry_policy=RetryPolicy(maximum_attempts=5),
        )
The check_existing_charge activity is read-only: it looks up the billing record, and any Stripe lookup it performs goes through an audit vault key scoped to GET /v1/charges only. It can never create charges, so it is safe to retry without limit. If the customer was already billed for this period (in a prior execution of the same workflow, or in a prior run of the batch job), the child workflow short-circuits before touching Stripe. The idempotency key in charge_customer_activity provides a second layer of protection at the Stripe API level for cases where the DB record hasn't been written yet when the re-run occurs.
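How the two defenses cover for each other can be sketched with in-memory stand-ins. Everything below is an assumption made for illustration: stripe_charges plays the role of Stripe's idempotency store, billing_db plays the role of the billing database, and bill_customer collapses the guard and the charge into one function. The scenario is a first batch run in which one customer's DB write is lost, followed by an operator re-run.

```python
import hashlib


def make_key(customer_id, amount_cents, billing_period):
    payload = f"{customer_id}:{amount_cents}:{billing_period}:temporal-billing"
    return hashlib.sha256(payload.encode()).hexdigest()[:36]


stripe_charges = {}  # idempotency key -> charge id (stand-in for Stripe)
billing_db = {}      # (customer, period) -> charge id (stand-in for the DB)


def bill_customer(customer_id, amount_cents, period, db_write_fails=False):
    # Defense 1: pre-charge guard against the billing record
    existing = billing_db.get((customer_id, period))
    if existing:
        return {"charge_id": existing, "already_charged": True}
    # Defense 2: idempotency key, so a repeated request returns the same charge
    key = make_key(customer_id, amount_cents, period)
    if key not in stripe_charges:
        stripe_charges[key] = f"ch_{len(stripe_charges) + 1}"
    charge_id = stripe_charges[key]
    if not db_write_fails:
        billing_db[(customer_id, period)] = charge_id
    return {"charge_id": charge_id, "already_charged": False}


customers = [f"cus_{i}" for i in range(4)]

# First batch run: the DB write for cus_3 is lost after the charge succeeds
for c in customers:
    bill_customer(c, 9900, "2026-06", db_write_fails=(c == "cus_3"))

# Operator re-runs the whole batch under a new workflow ID
rerun = [bill_customer(c, 9900, "2026-06") for c in customers]

print(len(stripe_charges), [r["already_charged"] for r in rerun])
```

Three customers are stopped by the guard. The fourth slips past it because its record was never written, and the idempotency key is what keeps the charge count at four.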
The complete governance pattern
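All three failure modes share a common root: Temporal's retry and replay semantics operate transparently at the infrastructure layer, with no knowledge of the Stripe API call they wrap. The governance pattern works at two levels: the application level (idempotency keys, activity separation, and the pre-charge guard) and the key-scoping level (restricted vault keys enforced by a proxy). The table lists the five mechanisms across both: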
| Layer | Mechanism | What it prevents |
|---|---|---|
| Idempotency key | SHA-256 hash of customer_id + amount_cents + billing_period + "temporal-billing" | Duplicate charges on activity retry; charges on workflow re-run |
| Activity separation | All Stripe calls inside @activity.defn, never in workflow body | Re-execution of Stripe calls during Temporal workflow replay |
| Pre-charge guard | check_existing_charge activity reads DB before calling Stripe | Re-billing on parent workflow re-run or operator retry |
| Vault key scoping | Billing vault key: POST /v1/charges only, daily USD cap. Audit vault key: GET /v1/charges only. | Runaway retry loops spending past the daily cap; accidental charges from the audit key |
| Proxy routing | stripe.StripeClient(api_key, base_addresses={"api": "https://proxy.keybrake.com/stripe"}) | Enforces vault key scope and spend cap at the network layer |
Routing through a proxy adds one line of configuration in the Temporal activity:
# billing_activities.py: vault key via Keybrake proxy
import os
import stripe
from temporalio import activity

@activity.defn
async def charge_stripe_activity(
    customer_id: str, amount_cents: int, billing_period: str
) -> dict:
    # Vault key scoped to POST /v1/charges only; daily cap = max expected batch
    billing_client = stripe.StripeClient(
        api_key=os.environ["VAULT_KEY_BILLING"],
        # StripeClient takes the API host override via base_addresses
        base_addresses={"api": "https://proxy.keybrake.com/stripe"},
    )
    idempotency_key = make_charge_idempotency_key(
        customer_id, amount_cents, billing_period
    )
    charge = billing_client.charges.create(
        params={
            "amount": amount_cents,
            "currency": "usd",
            "customer": customer_id,
            "description": f"Subscription {billing_period}",
        },
        options={"idempotency_key": idempotency_key},
    )
    return {"charge_id": charge.id}

@activity.defn
async def check_existing_charge(customer_id: str, billing_period: str) -> dict:
    # Audit vault key: GET /v1/charges only (for any Stripe-side lookup)
    audit_client = stripe.StripeClient(
        api_key=os.environ["VAULT_KEY_AUDIT"],
        base_addresses={"api": "https://proxy.keybrake.com/stripe"},
    )
    existing = get_charge_from_db(customer_id, billing_period)
    if existing:
        return {"charge_id": existing["charge_id"], "already_charged": True}
    return {"already_charged": False}
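What a scoped key with a daily cap has to decide per request can be sketched in a few lines. This is an illustrative model only, not Keybrake's implementation: VaultKeyPolicy, its fields, and the authorize method are hypothetical names, and a real proxy would also need persistence and concurrency control.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class VaultKeyPolicy:
    """Hypothetical per-key policy: an endpoint allow-list plus a daily spend cap."""
    allowed: frozenset                          # e.g. {("POST", "/v1/charges")}
    daily_cap_cents: int
    spent: dict = field(default_factory=dict)   # date -> cents spent that day

    def authorize(self, method, path, amount_cents=0, today=None):
        today = today or date.today()
        if (method, path) not in self.allowed:
            return False, "endpoint not in key scope"
        used = self.spent.get(today, 0)
        if used + amount_cents > self.daily_cap_cents:
            return False, "daily spend cap exceeded"
        self.spent[today] = used + amount_cents
        return True, "ok"


policy = VaultKeyPolicy(frozenset({("POST", "/v1/charges")}), daily_cap_cents=20000)
day = date(2026, 6, 1)

results = [
    policy.authorize("POST", "/v1/charges", 9900, today=day),  # within cap
    policy.authorize("POST", "/v1/charges", 9900, today=day),  # within cap
    policy.authorize("POST", "/v1/charges", 9900, today=day),  # would exceed cap
    policy.authorize("GET", "/v1/charges", today=day),         # outside key scope
]
print(results)
```

The cap is what bounds a retry loop that has lost its idempotency key: the third charge is refused regardless of why it was sent.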
Governance comparison
| Pattern | Bare key risk | Restricted key | Idempotency key | Spend cap | Audit log | Full governance |
|---|---|---|---|---|---|---|
| Temporal activity retry (no idem key) | Duplicate charge per retry attempt | Doesn't prevent duplicate charges | Stripe deduplicates across all retries | Caps retry blast radius | Each attempt logged | Idem key + restricted key + spend cap |
| Stripe call in workflow body | New charge on every replay | Doesn't prevent replay charges | Partially helps; architecture fix required | Caps replay blast radius | Replay charges logged | Move to activity + idem key |
| Child workflow fan-out re-run | Full customer list re-billed | Doesn't prevent re-billing | Deduplicates at Stripe layer | Caps per-run spend before batch exhausts | Re-run charges logged | Pre-charge guard + idem key + vault key |
Enforcement with pytest
# tests/test_temporal_billing_governance.py
import hashlib
import os
import pytest
from unittest.mock import patch

def make_idempotency_key(customer_id, amount_cents, billing_period):
    payload = f"{customer_id}:{amount_cents}:{billing_period}:temporal-billing"
    return hashlib.sha256(payload.encode()).hexdigest()[:36]

def test_idempotency_key_is_stable_across_retries():
    key1 = make_idempotency_key("cus_test", 9900, "2026-06")
    key2 = make_idempotency_key("cus_test", 9900, "2026-06")
    assert key1 == key2

def test_different_periods_produce_different_keys():
    key_june = make_idempotency_key("cus_test", 9900, "2026-06")
    key_july = make_idempotency_key("cus_test", 9900, "2026-07")
    assert key_june != key_july

@pytest.mark.asyncio
async def test_charge_activity_passes_idempotency_key_to_stripe():
    # Targets the module-level stripe.Charge.create variant of the activity
    with patch("stripe.Charge.create") as mock_create:
        mock_create.return_value = {"id": "ch_test_123"}
        from billing_activities import charge_stripe
        result = await charge_stripe("cus_test", 9900, "2026-06")
    call_kwargs = mock_create.call_args.kwargs
    assert call_kwargs["idempotency_key"] == make_idempotency_key(
        "cus_test", 9900, "2026-06"
    )
    assert result["charge_id"] == "ch_test_123"

@pytest.mark.asyncio
async def test_existing_charge_short_circuits_before_stripe():
    with patch("billing_workflows.get_charge_from_db") as mock_db:
        mock_db.return_value = {"charge_id": "ch_already_done"}
        with patch("stripe.Charge.create") as mock_stripe:
            from billing_workflows import check_existing_charge
            # Run the guard activity directly; it must not reach Stripe
            existing = await check_existing_charge("cus_test", "2026-06")
            mock_stripe.assert_not_called()
    assert existing["already_charged"] is True
    assert existing["charge_id"] == "ch_already_done"

def test_vault_key_env_vars_are_separate():
    billing_key = os.environ.get("VAULT_KEY_BILLING", "")
    audit_key = os.environ.get("VAULT_KEY_AUDIT", "")
    if billing_key and audit_key:
        assert billing_key != audit_key, "Billing and audit vault keys must be different"
Gap analysis
- Temporal's default maximum_attempts is unlimited. By default, Temporal retries activities indefinitely. A billing activity without an idempotency key and without a spend cap can fire a charge on every retry until the key expires or manual intervention stops the workflow. Setting maximum_attempts on the activity's RetryPolicy limits this, but the idempotency key is the correct primary fix: it makes the retry count irrelevant at the Stripe layer.
- workflow.continue_as_new() truncates event history. When a long-running workflow calls continue_as_new() to reset its event history (needed before it reaches Temporal's limit of 51,200 events or 50 MB), the new execution starts with no prior events. Idempotency keys that were passed as prior workflow results are not available in the new execution's history. If you derive idempotency keys from stable parameters (customer ID + billing period), they remain correct across continue_as_new() calls. Keys derived from workflow run IDs or activity IDs from the previous execution do not.
- schedule_to_close_timeout vs start_to_close_timeout retry semantics. Temporal distinguishes between the time allowed for an activity to be scheduled and started (schedule_to_start_timeout), the time for a started attempt to complete (start_to_close_timeout), and the total time including all retries (schedule_to_close_timeout). A schedule_to_close_timeout that expires while an activity is mid-retry causes a TimeoutError: the workflow fails even though the last attempt's charge may have succeeded, so the billing record may be missing even though the charge exists. Set schedule_to_close_timeout generously for billing activities. Heartbeating does not extend it; heartbeats only reset heartbeat_timeout.
- Temporal Cloud namespace isolation for vault key scoping. In Temporal Cloud, each namespace is billed and governed independently. If multiple product lines or teams share a Temporal namespace and its workers, their billing workflows share the same activity worker environment variables, including vault keys. A misconfigured worker startup that sets the wrong VAULT_KEY_BILLING for a namespace contaminates all billing activities in that namespace. Use separate namespaces or separate worker pools with separate environment variables for each billing context.
FAQ
Temporal deduplicates workflows by workflow ID — doesn't that protect me from duplicate charges?
Temporal's workflow ID uniqueness prevents two workflows with the same ID from running simultaneously in the same namespace. It does not prevent a completed workflow from being re-run under a new execution (the same workflow ID can be reused after a workflow terminates). It also doesn't prevent the activities within a workflow from being retried. The idempotency key operates at the Stripe API layer — it's the only mechanism that deduplicates charges regardless of how many times the activity runs.
Should I set maximum_attempts=1 for billing activities to prevent retries?
Disabling retries on billing activities creates a different problem: a transient Stripe API timeout (HTTP 503, connection reset) causes the billing workflow to fail permanently, requiring manual intervention for every network hiccup. With an idempotency key, you can safely allow retries — Stripe's idempotency guarantees that any number of retries with the same key return the same charge object. Set maximum_attempts to a small number (3–5) to prevent infinite retry storms, but don't set it to 1.
Can I use the Temporal workflow run ID as the idempotency key?
A workflow's run ID changes when the workflow is retried or re-run. Using the run ID as the idempotency key means a retry creates a new idempotency key — Stripe sees it as a new request and creates a new charge. Use stable business parameters (customer ID + amount + billing period) instead. These values are identical across all retries and re-runs of the same billing intent.
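A short comparison makes the difference visible. In this sketch the run IDs are random UUIDs standing in for the identifiers Temporal assigns to each run, and key_from_run_id is a hypothetical example of the pattern to avoid.

```python
import hashlib
import uuid


def key_from_business_params(customer_id, amount_cents, billing_period):
    payload = f"{customer_id}:{amount_cents}:{billing_period}:temporal-billing"
    return hashlib.sha256(payload.encode()).hexdigest()[:36]


def key_from_run_id(run_id, customer_id):
    # Anti-pattern: the key changes whenever the run ID changes
    return hashlib.sha256(f"{run_id}:{customer_id}".encode()).hexdigest()[:36]


first_run_id = str(uuid.uuid4())
second_run_id = str(uuid.uuid4())  # a retry or re-run gets a fresh run ID

stable_keys = {
    key_from_business_params("cus_test", 9900, "2026-06") for _ in range(2)
}
unstable_keys = {
    key_from_run_id(run_id, "cus_test") for run_id in (first_run_id, second_run_id)
}

print(len(stable_keys), len(unstable_keys))
```

Two runs of the same billing intent produce one key in the first set and two in the second, and Stripe treats each distinct key as a distinct request.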
How does heartbeating interact with billing activities?
Temporal heartbeating (activity.heartbeat()) tells the server that the activity is still making progress, which resets its heartbeat timeout, and lets the activity checkpoint state that survives a retry on a different worker. For billing activities, you can use heartbeat details to record the charge ID: after stripe.Charge.create() succeeds, call activity.heartbeat(charge_id). If the attempt fails or times out before completing, the next attempt can read activity.info().heartbeat_details (a sequence, so the charge ID is its first element) and skip the Stripe call entirely. Heartbeats are throttled by the SDK, so a crash immediately after the call can lose the checkpoint; treat this as an optimization on top of the idempotency key, not a replacement for it.
What happens to in-flight Stripe charges during a Temporal worker crash?
If the Temporal worker crashes while a stripe.Charge.create() call is in flight, the Temporal server notices when the activity's heartbeat timeout (or, without heartbeats, its start-to-close timeout) expires and schedules a new attempt, which may run on a different worker. The new attempt starts the activity from the beginning. If Stripe completed the charge before the crash, the charge exists on Stripe's side. The new attempt calls stripe.Charge.create() again with the same idempotency key, so Stripe returns the original charge object and no new charge is created. Without an idempotency key, the new attempt creates a second charge.
How do I scope vault keys across multiple Temporal namespaces?
Create one vault key pair per billing context: a billing key (POST /v1/charges only) and an audit key (GET /v1/charges only) per namespace or product line. Store them as environment variables on the specific worker pools that run billing workflows. Workers running non-billing workflows should not have access to the billing vault key — a misconfigured worker shouldn't be able to charge customers even if it receives a billing task by mistake. The proxy's per-key daily spend cap provides a hard ceiling even if a vault key is accidentally shared.
Keybrake: vault keys and spend caps for Temporal billing agents
Issue scoped vault keys for each Temporal activity type. Set per-key daily USD caps that stop runaway retry loops before they exhaust your billing budget. Full audit log of every Stripe call with workflow ID and activity attempt number. One-line proxy URL override in your Temporal activity code.