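A technology company
AI Post‑Training
Biotechnology
Technology
Shanghai
No experience requirement
No degree requirement
¥30K - 55K/month
Job Description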
Design, build, and optimize post‑training workflows, including SFT and preference‑based training methods such as RLHF, RLAIF, and DPO
Develop high‑quality training datasets, including instruction data, preference data, and safety‑related data
Implement and maintain scalable training pipelines for fine‑tuning and alignment of large language models
Collaborate with research teams to experiment with new post‑training techniques and evaluate their effectiveness
Build automated evaluation frameworks for model quality, safety, robustness, and user experience
Analyze model outputs, identify weaknesses, and propose targeted improvements
Work with product teams to understand real‑world use cases and translate them into training and evaluation requirements
Ensure compliance with safety, ethics, and responsible AI guidelines throughout the post‑training process
Optimize model performance through prompt engineering, data curation, and iterative training cycles
Requirements
Bachelor’s or Master’s degree in Computer Science, Machine Learning, Data Science, or a related field
2+ years of experience in machine learning, NLP, or model training
Strong understanding of LLM architectures, transformer models, and modern NLP techniques
Hands‑on experience with fine‑tuning, supervised learning, or reinforcement learning
Proficiency with Python and deep learning frameworks such as PyTorch or TensorFlow
Familiarity with distributed training, GPU acceleration, and large‑scale data processing
Strong analytical skills and the ability to evaluate model behavior and performance
Ability to collaborate across research, engineering, and product teams