DeepSeek Just Taught AI to Judge Itself
DeepSeek AI has introduced Self-Principled Critique Tuning (SPCT), a new reward modeling technique for training large language models (LLMs). The approach could make AI systems more adaptable and reliable in open-ended, complex domains, where traditional