Optimizing Contextual Chatbots via Synthetic Data Fine-Tuning and Selective Grid Search: Explorations from an LLM Competition

Authors

  • Mohd Suhairi Md Suhaimin, Politeknik Kota Bharu
  • Norsuzila Shafie, Politeknik Kota Bharu
  • Wan Siti Rodziah Mohd Nasir, Politeknik Kota Bharu

Keywords

Large Language Model, Fine-tuning, Synthetic Data, Selective Grid Search, AI Competition

Abstract

This paper explores how to optimize contextual chatbots by combining synthetic data fine-tuning with selective hyperparameter tuning. Developed for the POLYCC LLM League 2025 competition, the work addresses the challenge of improving lower-tier Large Language Models (LLMs) under strict architectural and computational constraints. The proposed methodology is a three-layer pipeline: (1) multi-model synthetic data generation, (2) Low-Rank Adaptation (LoRA) for parameter-efficient fine-tuning, and (3) a multi-stage competition evaluation. Instead of an exhaustive search, a selective grid search strategy was used to balance performance gains against training overhead. Using AWS SageMaker, the model was evaluated through an automated qualification phase followed by a multi-dimensional final assessment combining AI metrics, expert validation, and audience sentiment. Our findings indicate that data quality and targeted selection of the LoRA parameters (rank r and scaling factor α) yield better performance than simply increasing dataset volume. The resulting model showed significantly improved contextual grounding and generalization and secured the highest overall ranking (1st place) among the competition finalists. These results offer a practical roadmap for deploying high-performance LLMs in resource-constrained, domain-specific applications.
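The idea of a selective grid search over the LoRA parameters r and α can be illustrated with a minimal sketch. The abstract does not state which selection rule the authors used, so everything below is an assumption made for illustration: the candidate values, the rule of keeping only pairs with a chosen α/r ratio, and the helper names `full_grid`, `selective_grid`, `lora_trainable_params`, and `select_best` are all hypothetical. The only fact relied on is the standard LoRA construction, in which a frozen weight matrix of shape d_out × d_in receives a low-rank update scaled by α/r and built from two matrices with r · (d_in + d_out) trainable entries in total.

```python
from itertools import product


def full_grid(ranks, alphas):
    """Exhaustive grid: every (r, alpha) combination."""
    return list(product(ranks, alphas))


def selective_grid(ranks, alphas, ratios=(1, 2)):
    """Hypothetical selection rule: keep only pairs whose alpha / r
    ratio is in `ratios`. This is an illustrative assumption, not the
    rule used in the paper."""
    return [(r, a) for r, a in product(ranks, alphas) if a / r in ratios]


def lora_trainable_params(d_in, d_out, r):
    """Trainable entries LoRA adds to one d_out x d_in weight matrix:
    B is d_out x r and A is r x d_in."""
    return r * (d_in + d_out)


def select_best(candidates, evaluate, cost, cost_weight=0.0):
    """Return (score, r, alpha) maximizing evaluate(r, alpha) minus a
    cost penalty. `evaluate` and `cost` are caller-supplied functions;
    in practice `evaluate` would run a fine-tuning job and return a
    validation metric."""
    best = None
    for r, alpha in candidates:
        score = evaluate(r, alpha) - cost_weight * cost(r)
        if best is None or score > best[0]:
            best = (score, r, alpha)
    return best


if __name__ == "__main__":
    ranks, alphas = [4, 8, 16], [8, 16, 32]  # assumed candidate values
    print(len(full_grid(ranks, alphas)), "configurations in the full grid")
    print(len(selective_grid(ranks, alphas)), "configurations after selection")
```

Under these assumed values, the filter reduces the number of fine-tuning runs from nine to five. The `cost_weight` term is one simple way to express the trade-off between performance gains and training overhead that the abstract describes; other penalties, such as measured training time, would fit the same interface.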

Published

04.06.2026

How to Cite

Optimizing Contextual Chatbots via Synthetic Data Fine-Tuning and Selective Grid Search: Explorations from an LLM Competition. (2026). Journal of STEM and Education, 6(1), 5-12. https://journalstem.net/ojs/index.php/pkb/article/view/148