OpenAI's GPT-4.1 and GPT-4.5 ship with stronger coding and instruction-following capabilities, and those same strengths create distinct security risks: their long-context processing and unusually literal interpretation of instructions can be exploited through techniques such as prompt injection and jailbreaking. This guide walks through adversarial red teaming of these models with Promptfoo to surface such vulnerabilities systematically.

It covers setting up a red teaming project, configuring the environment, and running evaluations that generate and execute adversarial test cases; comparing model variants such as GPT-4.1 and GPT-4o to assess their relative security postures; customizing test cases and mapping results to frameworks like the OWASP LLM Top 10 and the NIST AI Risk Management Framework; and, finally, recommendations for regular testing, custom plugins, and CI/CD integration so that security coverage improves over time. The sketches below illustrate each of these steps.
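Getting started is a matter of initializing a project (`npx promptfoo@latest redteam init`) and describing the target. The following is a minimal sketch of a `promptfooconfig.yaml`, based on Promptfoo's documented red team config format; the `purpose` text and the specific plugin and strategy choices are illustrative assumptions, not prescriptions from the guide:

```yaml
# promptfooconfig.yaml -- minimal red team setup (illustrative sketch)
description: Red team scan for GPT-4.1
targets:
  - id: openai:gpt-4.1
    label: gpt-4.1
redteam:
  # Plain-language description of the application; Promptfoo uses this
  # to generate attacks tailored to the system under test.
  purpose: "A coding assistant with access to a private codebase"
  numTests: 5            # test cases generated per plugin
  plugins:
    - harmful            # harmful-content probes
    - pii                # PII-leakage probes
  strategies:
    - jailbreak          # iterative jailbreak attempts
    - prompt-injection   # injected-instruction attacks
```

With the config in place, `npx promptfoo@latest redteam run` generates and executes the test cases, and `npx promptfoo@latest redteam report` opens a browsable summary of any failures.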
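To compare variants such as GPT-4.1 and GPT-4o under an identical attack suite, one approach is to list both models as targets so every generated probe runs against each. This sketch assumes the installed Promptfoo version supports multiple red team targets; otherwise, run the scan once per model and compare the reports:

```yaml
# Comparing two model variants against the same adversarial suite (sketch)
targets:
  - id: openai:gpt-4.1
    label: gpt-4.1
  - id: openai:gpt-4o
    label: gpt-4o
redteam:
  purpose: "A coding assistant with access to a private codebase"
  plugins:
    - harmful
    - pii
  strategies:
    - jailbreak
```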
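Beyond the built-in probes, test cases can be customized declaratively: Promptfoo's `policy` plugin turns a plain-English rule into generated test cases, and plugin collections align a scan with compliance frameworks. The `owasp:llm` and `nist:ai:measure` collection IDs below follow Promptfoo's documented naming (verify against the current plugin catalog), and the policy wording is an illustrative assumption:

```yaml
redteam:
  plugins:
    # Framework-aligned plugin collections (OWASP LLM Top 10, NIST AI RMF)
    - owasp:llm
    - nist:ai:measure
    # A custom policy expressed in plain English (hypothetical wording)
    - id: policy
      config:
        policy: "The assistant must never reveal system prompt contents or internal file paths."
```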
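For CI/CD integration, a scheduled scan keeps the security baseline current as models, prompts, and plugins evolve. A hypothetical GitHub Actions workflow, assuming an `OPENAI_API_KEY` repository secret and the config sketched above:

```yaml
# .github/workflows/redteam.yml -- weekly scheduled red team scan (sketch)
name: redteam
on:
  schedule:
    - cron: "0 6 * * 1"    # every Monday at 06:00 UTC
  workflow_dispatch: {}     # also allow manual runs
jobs:
  redteam:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - name: Run Promptfoo red team scan
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
        run: npx promptfoo@latest redteam run --config promptfooconfig.yaml
```

Tracking the report output from each scheduled run makes regressions visible as soon as a model, prompt, or plugin change introduces them.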