Understanding and managing LLM token usage across workloads, teams, and providers is essential for controlling costs and optimizing efficiency. Tokens represent computational work and are the basis for billing and performance metrics in AI models, yet consumption is inconsistent: providers tokenize text differently, and inefficiencies such as retries or agent loops inflate counts silently.

A comprehensive tracking framework tags tokens with identity and purpose, accounts for both input and output tokens, attributes usage to specific teams or workloads, and enforces budgets and rate limits.

Portkey addresses these challenges as an AI gateway that centralizes model access, standardizes token behavior across providers, and provides detailed observability and reporting. It enables platform teams to enforce policies, track token efficiency, and ensure accountability, turning token usage from a static billing detail into a controllable resource.