How LLM Settings Affect Prompt Engineering
Blog post from Vectorize
Communicating with a large language model (LLM) while creating and testing prompts typically happens through an API, and getting the output you want usually takes some trial and error with a handful of settings.

"Temperature" controls how predictable or creative responses are by adjusting how strongly the model favors the most likely next token: lower values produce more deterministic, focused output, while higher values let less likely tokens through and make responses more varied. "Top_p" (nucleus sampling) has a similar effect on determinism by restricting sampling to the smallest set of tokens whose combined probability reaches the chosen threshold. "Max length" caps the number of tokens the model generates, preventing excessively long outputs.

"Stop sequences" end a response as soon as the model produces a specified string, offering another way to control length and structure. "Frequency penalty" and "presence penalty" both reduce repetition by penalizing tokens that have already appeared: the frequency penalty grows with how often a token has been used, while the presence penalty applies a flat penalty once a token has appeared at all. The general recommendation is to adjust either the frequency penalty or the presence penalty, but not both.

Because the effect of these settings can vary across LLM versions, expect to experiment in order to tailor responses to the task at hand, whether that is fact-based question answering or creative writing.
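To make this concrete, below is a minimal sketch of how these settings typically appear in an API call, using the OpenAI Python SDK as one illustrative example; the model name and the specific values are assumptions for demonstration, and other providers expose similar parameters under slightly different names.

```python
# Illustrative sketch only: assumes the OpenAI Python SDK (openai>=1.0) and an
# example model name; adapt the client and parameter names for your provider.
from openai import OpenAI

client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable

response = client.chat.completions.create(
    model="gpt-4o-mini",       # example model name (assumption)
    messages=[
        {"role": "user", "content": "List three facts about the Moon."}
    ],
    temperature=0.2,           # low temperature: predictable, fact-oriented output
    top_p=1.0,                 # nucleus sampling threshold; typically tune temperature or top_p, not both
    max_tokens=150,            # "max length": cap on the number of generated tokens
    stop=["\n\n"],             # stop sequence: end the response at the first blank line
    frequency_penalty=0.3,     # penalty that grows with how often a token has been used
    presence_penalty=0.0,      # flat penalty once a token appears; adjust one penalty, not both
)

print(response.choices[0].message.content)
```

For a creative-writing prompt, you might instead raise the temperature (for example to 0.9) and drop the stop sequence while leaving the other settings alone, which illustrates how the same parameters are tuned differently depending on the task.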