Design and optimize large language models, devise fine-tuning strategies, and streamline the training process.
Explore deep learning architectures such as Seq2Seq and Transformer, and advanced techniques including Supervised Fine-Tuning (SFT), Prompt Engineering, and Soft Prompting.
Develop systems for efficient model training and deployment, covering data preprocessing, parallel training, and resource management.
Establish performance evaluation systems and monitor training metrics to ensure model quality and iteration efficiency.
Requirements:
Bachelor's degree or higher in Computer Science, Artificial Intelligence, Mathematics, or a related field.