GenAI KUWE

About

The KUWE project is a joint initiative by the Harvard Digital China Initiative and the China Biographical Database Project. The project aims to collect the most useful large language model projects for East Asian Studies. It is currently maintained by Kwok-leong Tang, Hongsu Wang, Wenxin Xiao, and Helen He. If you have any suggestions or comments, please don’t hesitate to contact the primary maintainer of this website, Hongsu Wang, at hongsuwang(at)fas.harvard.edu.

New Models

google/embeddinggemma-300m

https://huggingface.co/google/embeddinggemma-300m

EmbeddingGemma is a 300M parameter, state-of-the-art for its size, open embedding model from Google, built from Gemma 3 (with T5Gemma initialization) and the same research and technology used to create Gemini models. EmbeddingGemma produces vector representations of text, making it well-suited for search and retrieval tasks, including classification, clustering, and semantic similarity search. This model was trained with data in 100+ spoken languages.

The small size and on-device focus make it possible to deploy in environments with limited resources such as mobile phones, laptops, or desktops, democratizing access to state-of-the-art AI models and helping foster innovation for everyone.

Inputs and outputs

Input:
- Text string, such as a question, a prompt, or a document to be embedded
- Maximum input context length of 2048 tokens
Output:
- Numerical vector representations of input text data
- Output embedding dimension size of 768, with smaller options available (512, 256, or 128) via Matryoshka Representation Learning (MRL). MRL allows users to truncate the output embedding of size 768 to their desired size and then re-normalize for efficient and accurate representation.
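The truncate-then-re-normalize step described for MRL can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the model's own API: the function name `truncate_and_renormalize` and the toy 8-dimensional vector are assumptions made for the example. A real EmbeddingGemma output would have 768 components and be cut to 512, 256, or 128.

```python
import math

def truncate_and_renormalize(embedding, dim):
    """Keep the first `dim` components, then rescale to unit L2 norm.

    Hypothetical helper for illustration; `embedding` is a plain list of floats.
    """
    if not 0 < dim <= len(embedding):
        raise ValueError("dim must be between 1 and len(embedding)")
    head = embedding[:dim]  # MRL keeps the leading components
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]

# Toy stand-in for a model output (a real vector would have 768 entries).
full = [0.4, -0.2, 0.1, 0.5, -0.3, 0.2, 0.6, -0.1]
small = truncate_and_renormalize(full, 4)

# After re-normalization the shortened vector has unit length again,
# so dot products between such vectors are cosine similarities.
length = math.sqrt(sum(x * x for x in small))
print(len(small), round(length, 6))
```

Re-normalizing matters because dropping components shrinks the vector's norm; without it, dot-product scores computed on truncated vectors would not be comparable to cosine similarities.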

Qwen/Qwen3-Next-80B-A3B-Instruct

https://huggingface.co/Qwen/Qwen3-Next-80B-A3B-Instruct

Over the past few months, we have observed increasingly clear trends toward scaling both total parameters and context lengths in the pursuit of more powerful and agentic artificial intelligence (AI). We are excited to share our latest advancements in addressing these demands, centered on improving scaling efficiency through innovative model architecture. We call these next-generation foundation models Qwen3-Next.

Qwen3-Next-80B-A3B is the first installment in the Qwen3-Next series and features the following key enhancements:

- Hybrid Attention: Replaces standard attention with a combination of Gated DeltaNet and Gated Attention, enabling efficient context modeling for ultra-long context lengths.
- High-Sparsity Mixture-of-Experts (MoE): Achieves an extremely low activation ratio in MoE layers, drastically reducing FLOPs per token while preserving model capacity.
- Stability Optimizations: Includes techniques such as zero-centered and weight-decayed layernorm, and other stabilizing enhancements for robust pre-training and post-training.
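To make the "low activation ratio" idea concrete, here is a small sketch of generic top-k expert routing together with the ratio arithmetic. It is an illustration of sparse MoE in general, not Qwen3-Next's actual router: the function names, the four-expert example, and `k = 2` are hypothetical, and the parameter counts simply read the model name "80B-A3B" as roughly 80B total and 3B activated parameters.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_scores, k):
    """Return {expert_index: weight} for the k highest-scoring experts.

    Only these k experts run for the token; weights are renormalized to sum to 1.
    """
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    mass = sum(probs[i] for i in top)
    return {i: probs[i] / mass for i in top}

def activation_ratio(active_params, total_params):
    """Fraction of parameters used per token."""
    return active_params / total_params

# Hypothetical router scores for one token over four experts.
weights = route_top_k([0.1, 2.0, -1.0, 1.5], k=2)
print(sorted(weights))  # indices of the experts that actually run

# Assumed reading of "80B-A3B": about 3B of 80B parameters active per token.
print(activation_ratio(3e9, 80e9))  # about 0.0375, i.e. under 4%
```

The point of the sketch is the cost model: compute per token scales with the experts that are selected, while capacity scales with all experts, which is why a small activation ratio cuts FLOPs without shrinking the model.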