Optimizing a Contextual Chatbot for the POLYCC: A Competition-Based LLM Fine-Tuning Experience
Keywords:
Large Language Model, Fine-Tuning, LoRA, POLYCC, Competition-Based Evaluation
Abstract— This paper reports a competition-based experience in fine-tuning a large language model (LLM) for the Malaysian Polytechnic and Community College (POLYCC) domain using the SynLoRA-SGS methodology. Participating in the student category of the POLYCC LLM League 2025, our team fine-tuned a Meta-Llama-3-8B-Instruct model through bilingual synthetic data generation and Low-Rank Adaptation (LoRA) hyperparameter optimization on Amazon SageMaker. Despite qualifying in last position (6th of 6 finalists) during the automated leaderboard stage, the team achieved second place overall in the final multi-source evaluation with a grand total of 149 points, including the highest spectator score among all finalists. This result demonstrates that focused dataset refinement between competition stages can produce substantial performance gains, and that a systematic fine-tuning approach can overcome initial ranking disadvantages when evaluated through holistic multi-source assessment.

