LLM-as-a-judge evaluation uses a large language model to assess the outputs of AI systems, but the approach often meets skepticism because of perceived circular reasoning and disappointing initial results. Reliability improves when the judge is given an "unfair advantage": the evaluation task is simplified and the judge receives clearer, more specific criteria than the system being evaluated had. Examples include multi-modal advantages, which leverage visual representations of the output, and occasionally deploying a stronger model for more complex reasoning tasks. By contrast, relying on general rubrics or stronger models without additional context tends to yield less effective results. The article emphasizes deliberately creating these unfair advantages to improve evaluation reliability and mentions Gentrace as a tool for building and monitoring such evaluations.
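As a rough illustration of the "unfair advantage" idea, the sketch below narrows the judge's job to a single pass/fail faithfulness check and hands it the source document alongside the output, rather than asking it to apply a general quality rubric. The model name, prompt wording, and function names are illustrative assumptions, not code from the article or from Gentrace.

```python
# Minimal sketch of an LLM-as-a-judge check with an "unfair advantage":
# the judge sees the source document and answers one narrow yes/no question,
# instead of grading overall quality against a broad rubric.
# Model name, prompt, and helper names are assumptions for illustration.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

JUDGE_PROMPT = """You are grading a summary against its source document.
Answer with exactly one word: PASS if every claim in the summary is
supported by the source, FAIL otherwise.

Source document:
{source}

Summary to grade:
{summary}
"""


def judge_summary(source: str, summary: str, model: str = "gpt-4o-mini") -> bool:
    """Return True if the judge finds the summary faithful to the source."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "user", "content": JUDGE_PROMPT.format(source=source, summary=summary)}
        ],
        temperature=0,  # deterministic grading reduces judge variance
    )
    verdict = response.choices[0].message.content.strip().upper()
    return verdict.startswith("PASS")


if __name__ == "__main__":
    source = "The Q3 report shows revenue grew 12% while headcount stayed flat."
    summary = "Revenue grew 12% in Q3 with no change in headcount."
    print("faithful:", judge_summary(source, summary))
```

The design choice here is the advantage itself: the judge is given ground-truth context and a binary criterion, so even a modest model can grade reliably, whereas the same model asked "is this a good summary?" with a generic rubric would likely produce the disappointing results the article describes.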