Latest news and articles about Sparse Attention
Total: 1 article found
DeepSeek has released its V4-preview model, which features a 1M-token context window and comes in specialized versions for high-end reasoning and for low-cost efficiency. The release uses new sparse attention technology and offers open-source weights, challenging the dominance of top-tier closed-source models.