Temperature-Based Code Generation Using Ollama
Keywords: Ollama, temperature manipulation, Large Language Models, prompt, API, code generation

Abstract
Large language models (LLMs) are increasingly used for automated code generation, yet their reliability remains strongly dependent on inference-time parameters such as sampling temperature. The goal of this study was to analyse and compare the effects of temperature settings on programming code generation using selected Ollama-based Llama models. Through controlled experiments with the general-purpose Llama 3.1 model and the code-specialized Codellama model, the study evaluates how different temperature values affect syntactic correctness, logical consistency, and adherence to strict prompt constraints.
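The sampling temperature examined in the study is set per request through Ollama's HTTP API. A minimal sketch of how such a request could be constructed, assuming a default local Ollama server; the endpoint and option names follow Ollama's `/api/generate` interface, while the prompt text and model tag are illustrative:

```python
import json
import urllib.request

# Default endpoint of a locally running Ollama server (assumed setup).
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model: str, prompt: str, temperature: float) -> dict:
    """Build a non-streaming Ollama generation request with a fixed temperature."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"temperature": temperature},
    }

def generate(model: str, prompt: str, temperature: float) -> str:
    """Send the request to the Ollama server and return the generated text."""
    payload = json.dumps(build_request(model, prompt, temperature)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    # Illustrative prompt resembling one of the study's tasks.
    print(build_request("codellama", "Write a Python function is_prime(n).", 0.2))
```

Running the same prompt across a grid of temperature values (e.g. 0.0 to 1.0) with this helper is one way to reproduce the kind of controlled comparison described above.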
Two low-complexity programming tasks were selected to isolate temperature effects from algorithmic difficulty: basic statistical computation and prime number detection. The results show that lower temperature settings generally improve determinism and structural correctness, particularly for Codellama, which produced logically correct solutions at low temperatures. However, hallucinations were still observed, including incorrect algorithm naming and violations of output constraints. In contrast, Llama 3.1 frequently generated syntactically valid but logically incorrect code, regardless of temperature, indicating limitations in its suitability for strict programming tasks.
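For a task like prime number detection, a simple reference implementation can serve as ground truth when judging the logical correctness of generated code. A sketch of such a reference (the function name and trial-division approach are illustrative, not taken from the study's prompts):

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test: check divisors up to sqrt(n)."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

# Comparing model output against such a reference over a range of inputs
# exposes solutions that are syntactically valid but logically incorrect.
```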
At higher temperature values, both models exhibited increased instability, with degraded control flow and reduced compliance with formal specifications. These findings suggest that temperature alone is insufficient to guarantee reliable code generation, though it can support improved output quality when combined with code-specialized models and strict validation mechanisms. The paper highlights the importance of model selection, prompt design, and post-generation verification in educational and evaluation-driven programming applications.
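The post-generation verification recommended above can be partially automated. A minimal sketch of a syntax gate for generated Python code, using the standard-library `ast` module (the function name is a hypothetical helper, not part of the study's tooling):

```python
import ast

def passes_syntax_check(source: str) -> bool:
    """Return True if the generated source parses as valid Python."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False

# A syntax check alone is insufficient: as observed in the study, code may
# parse cleanly yet be logically wrong, so functional tests must follow.
```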