Retry Policy from HTTP Responses | Temporal Platform Documentation
This recipe extends the Basic Example to show how to extract retry information from HTTP response headers and make it available to Temporal's retry mechanisms.
HTTP response codes and headers on API calls have implications for retry behavior.
For example, an HTTP 404 Not Found generally represents an application-level error that should not be retried.
By contrast, a 500 Internal Server Error is typically transient, so it should be retried.
Servers can also set the Retry-After header to tell the client when to retry.
This recipe introduces a utility function that processes the HTTP response and populates a Temporal ApplicationError to provide inputs to the retry mechanism.
Temporal combines this information with other configuration, such as timeouts and exponential backoff, to implement the complete retry policy.
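To make the interaction concrete, the sketch below computes the kind of backoff schedule such a policy produces. It is an illustration in plain Python, not Temporal's implementation: the function name `backoff_interval` is hypothetical, and the defaults (1-second initial interval, backoff coefficient of 2.0, 100-second maximum interval) are assumed to mirror Temporal's default retry policy. The sketch also assumes that a server-provided delay, when present, replaces the computed interval for that one retry.

```python
from typing import Optional


def backoff_interval(
    attempt: int,
    initial: float = 1.0,
    coefficient: float = 2.0,
    maximum: float = 100.0,
    server_delay: Optional[float] = None,
) -> float:
    """Seconds to wait before retry number `attempt` (1-based). Hypothetical helper."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    # Assumption: a server-provided delay hint takes precedence for this retry.
    if server_delay is not None:
        return server_delay
    # Otherwise grow the interval exponentially, capped at `maximum`.
    return min(initial * coefficient ** (attempt - 1), maximum)


# Schedule for the first eight retries under the assumed defaults.
schedule = [backoff_interval(n) for n in range(1, 9)]
print(schedule)
```

Passing `server_delay=12.5` for a given attempt returns 12.5 regardless of the attempt number, which is how a Retry-After hint would take the place of the computed interval in this model.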
Generate Temporal ApplicationErrors from HTTP responses
We introduce a utility function that takes an httpx.Response object and returns a Temporal ApplicationError with two key fields populated: non_retryable and next_retry_delay.
The non_retryable field is determined by categorizing the HTTP status code.
The non-standard X-Should-Retry HTTP response header, when present, overrides the status code.
Example Retryable Status Codes:
- 408 Request Timeout → Retry because server is unresponsive, which can have many causes
- 409 Conflict → Retry when resource is temporarily locked or in use
- 429 Too Many Requests → Retry after rate limit cooldown (respect the Retry-After header when available)
- 500 Internal Server Error → Retry for temporary server issues
- 502 Bad Gateway → Retry when upstream server is temporarily unavailable
- 503 Service Unavailable → Retry when service is temporarily overloaded
- 504 Gateway Timeout → Retry when upstream server times out
Example Non-Retryable Status Codes:
- 400 Bad Request → Do not retry - fix request format/parameters
- 401 Unauthorized → Do not retry - provide valid authentication
- 403 Forbidden → Do not retry - insufficient permissions
- 404 Not Found → Do not retry - resource does not exist
- 422 Unprocessable Entity → Do not retry - fix request validation errors
- Other 4xx Client Errors → Do not retry - client-side issues need fixing
- 2xx Success → Do not expect to see this - call succeeded
- 3xx Redirects → Do not expect to see this - typically handled by httpx (with follow_redirects=True)
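The two lists above can be condensed into a small predicate. The following is a minimal, dependency-free sketch for checking the categorization by hand; the names `RETRYABLE_4XX` and `is_retryable` are hypothetical and exist only for this illustration.

```python
from typing import Optional

# The 4xx codes that the lists above treat as transient.
RETRYABLE_4XX = {408, 409, 429}


def is_retryable(status_code: int, should_retry_header: Optional[str] = None) -> bool:
    """Return True if a failed response with this status should be retried."""
    # An explicit server hint wins over the status code.
    if should_retry_header == "true":
        return True
    if should_retry_header == "false":
        return False
    # All 5xx codes plus a few 4xx codes are treated as transient.
    return status_code >= 500 or status_code in RETRYABLE_4XX


for code in (404, 429, 503):
    print(code, is_retryable(code))
```

Keeping the retryable 4xx codes in one set makes the policy easy to audit: every other client error falls through to "do not retry".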
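If the error is retryable and the Retry-After header is present, we parse it to set the retry delay. Delays that are not positive or that exceed 60 seconds are ignored, and Temporal falls back to its normal backoff.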
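The Retry-After header comes in two forms: a number of seconds or an HTTP-date. The sketch below shows how both can be parsed with the standard library alone. The function name `retry_after_seconds` and its `now` parameter are hypothetical, added so that the date form can be evaluated against a fixed clock.

```python
import email.utils
import time
from typing import Optional


def retry_after_seconds(value: Optional[str], now: Optional[float] = None) -> Optional[float]:
    """Parse a Retry-After value given as delay-seconds or as an HTTP-date."""
    if value is None:
        return None
    try:
        return float(value)  # delay-seconds form, e.g. "120"
    except ValueError:
        pass
    parsed = email.utils.parsedate_tz(value)  # HTTP-date form
    if parsed is None:
        return None  # neither a number nor a parseable date
    current = time.time() if now is None else now
    # mktime_tz converts the parsed tuple to a UTC epoch timestamp.
    return email.utils.mktime_tz(parsed) - current


print(retry_after_seconds("120"))
print(retry_after_seconds("Wed, 21 Oct 2015 07:28:00 GMT", now=1445412450.0))
```

For the date form the result is relative to the current time, so a date in the past yields a negative number. A caller should validate the result before using it as a delay.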
This implementation duplicates logic present in the OpenAI Python API Library, where it is part of the code generated by Stainless. Duplicating the logic makes sense because it is not accessible via the public library interface and because it applies to HTTP APIs in general, not just the OpenAI API.
File: util/http_retry.py
import email.utils
import time
from datetime import timedelta
from typing import Optional, Tuple

from temporalio import workflow
from temporalio.exceptions import ApplicationError

with workflow.unsafe.imports_passed_through():
    from httpx import Response, Headers


# Adapted from the OpenAI Python client (https://github.com/openai/openai-python/blob/main/src/openai/_base_client.py)
# which is generated by the Stainless SDK Generator.
def _parse_retry_after_header(response_headers: Optional[Headers] = None) -> Optional[float]:
    """Returns the number of seconds (not milliseconds) to wait before retrying, or None if unspecified.

    About the Retry-After header: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Retry-After
    See also https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Retry-After#syntax
    """
    if response_headers is None:
        return None

    # First, try the non-standard `retry-after-ms` header for milliseconds,
    # which is more precise than integer-seconds `retry-after`
    try:
        retry_ms_header = response_headers.get("retry-after-ms", None)
        return float(retry_ms_header) / 1000
    except (TypeError, ValueError):
        pass

    # Next, try parsing `retry-after` header as seconds (allowing nonstandard floats).
    retry_header = response_headers.get("retry-after")
    try:
        # note: the spec indicates that this should only ever be an integer
        # but if someone sends a float there's no reason for us to not respect it
        return float(retry_header)
    except (TypeError, ValueError):
        pass

    # Last, try parsing `retry-after` as a date.
    # parsedate_tz returns None for a missing or unparseable value.
    retry_date_tuple = email.utils.parsedate_tz(retry_header)
    if retry_date_tuple is None:
        return None

    retry_date = email.utils.mktime_tz(retry_date_tuple)
    return float(retry_date - time.time())

def _should_retry(response: Response) -> Tuple[bool, str]:
    # Note: this is not a standard header
    should_retry_header = response.headers.get("x-should-retry")

    # If the server explicitly says whether or not to retry, obey.
    if should_retry_header == "true":
        return True, f"Server requested retry via x-should-retry=true header (HTTP {response.status_code})"
    if should_retry_header == "false":
        return False, f"Server prevented retry via x-should-retry=false header (HTTP {response.status_code})"

    # Retry on request timeouts.
    if response.status_code == 408:
        return True, f"HTTP request timeout ({response.status_code}), will retry with backoff"

    # Retry on lock timeouts.
    if response.status_code == 409:
        return True, f"HTTP conflict/lock timeout ({response.status_code}), will retry with backoff"

    # Retry on rate limits.
    if response.status_code == 429:
        return True, f"HTTP rate limit exceeded ({response.status_code}), will retry with backoff"

    # Retry internal errors.
    if response.status_code >= 500:
        return True, f"HTTP server error ({response.status_code}), will retry with backoff"

    return False, f"HTTP client error ({response.status_code}), not retrying - check your request"

def http_response_to_application_error(response: Response) -> ApplicationError:
    """Transform HTTP response into Temporal ApplicationError for retry handling.

    This function implements generic HTTP retry logic based on status codes and headers.

    Args:
        response: The httpx.Response from a failed HTTP request

    Returns:
        ApplicationError: Always returns an ApplicationError configured for Temporal's retry system:
            - non_retryable: False for retryable errors, True for non-retryable
            - next_retry_delay: Server-provided delay hint (if valid)

    Note:
        Even when x-should-retry=true, this function returns an ApplicationError with
        non_retryable=False rather than raising an exception, for cleaner functional style.
    """
    should_retry, retry_message = _should_retry(response)

    if should_retry:
        # Calculate the retry delay only when retrying
        retry_after = _parse_retry_after_header(response.headers)

        # Make sure that the retry delay is in a reasonable range
        if retry_after is not None and 0 < retry_after <= 60:
            retry_after = timedelta(seconds=retry_after)
        else:
            retry_after = None

        # Add delay info for rate limits
        if response.status_code == 429 and retry_after is not None:
            retry_message = f"HTTP rate limit exceeded (429) (server requested {retry_after.total_seconds():.1f}s delay), will retry with backoff"

        return ApplicationError(
            retry_message,
            non_retryable=False,
            next_retry_delay=retry_after,
        )
    else:
        return ApplicationError(
            retry_message,
            non_retryable=True,
            next_retry_delay=None,
        )
Raise the exception from the Activity
When an API call fails, the OpenAI client raises an APIStatusError exception whose response field holds the underlying httpx.Response object.
We use the http_response_to_application_error function defined above to translate it into a Temporal ApplicationError, which we raise to pass the retry information to Temporal.
File: activities/openai_responses.py
from dataclasses import dataclass

from openai import APIStatusError, AsyncOpenAI
from openai.types.responses import Response
from temporalio import activity

from util.http_retry import http_response_to_application_error


# Temporal best practice: Create a data structure to hold the request parameters.
@dataclass
class OpenAIResponsesRequest:
    model: str
    instructions: str
    input: str


@activity.defn
async def create(request: OpenAIResponsesRequest) -> Response:
    # Temporal best practice: Disable retry logic in OpenAI API client library.
    client = AsyncOpenAI(max_retries=0)

    try:
        resp = await client.responses.create(
            model=request.model,
            instructions=request.instructions,
            input=request.input,
            timeout=15,
        )
        return resp
    except APIStatusError as e:
        # Translate the HTTP failure so Temporal's retry mechanism decides what happens next.
        raise http_response_to_application_error(e.response) from e
Running
Start the Temporal Dev Server:
temporal server start-dev
Run the worker:
uv run python -m worker
Start execution:
uv run python -m start_workflow