The AI for Asian Studies project is a joint initiative of the Harvard Digital China Initiative and the China Biographical Database Project. It aims to collect the most useful large language model projects for East Asian Studies and is currently maintained by Hongsu Wang. If you have any suggestions or comments, please contact him at hongsuwang(at)fas.harvard.edu.
google/translategemma-4b-it
https://huggingface.co/google/translategemma-4b-it
TranslateGemma is a family of lightweight, state-of-the-art open translation models from Google, built on the Gemma 3 family of models.
TranslateGemma models are designed to handle translation tasks across 55 languages. Their relatively small size makes it possible to deploy them in environments with limited resources such as laptops, desktops, or your own cloud infrastructure, democratizing access to state-of-the-art translation models and helping foster innovation for everyone.
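A minimal sketch of running the model locally with Hugging Face transformers. The "text-generation" pipeline task and the prompt wording below are assumptions; consult the model card for the prompting convention TranslateGemma actually expects (for example, how source and target languages are indicated).

```python
# Sketch: local translation with google/translategemma-4b-it via transformers.
# The pipeline task and prompt format are assumptions -- check the model card.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="google/translategemma-4b-it",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Chat-style request asking for a Chinese -> English translation
# (illustrative example; adjust to the model's documented prompt format).
messages = [
    {
        "role": "user",
        "content": "Translate the following Chinese text into English: 司馬光，字君實，陝州夏縣人也。",
    }
]

output = pipe(messages, max_new_tokens=256)
print(output[0]["generated_text"][-1]["content"])
```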
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16
https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16
Nemotron-3-Nano-30B-A3B-BF16 is a large language model (LLM) trained from scratch by NVIDIA and designed as a unified model for both reasoning and non-reasoning tasks. It responds to queries and tasks by first generating a reasoning trace and then concluding with a final response. Reasoning can be toggled through a flag in the chat template: disabling it makes the model answer without intermediate reasoning traces, at the cost of a slight drop in accuracy on harder prompts that require reasoning, while enabling it generally yields higher-quality final solutions.
The model employs a hybrid Mixture-of-Experts (MoE) architecture, consisting of 23 Mamba-2 and MoE layers, along with 6 Attention layers. Each MoE layer includes 128 experts plus 1 shared expert, with 6 experts activated per token. The model has 3.5B active parameters and 30B parameters in total.
Supported languages include English, German, Spanish, French, Italian, and Japanese. The model was improved using Qwen.
This model is ready for commercial use.
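A minimal sketch of toggling the reasoning trace described above, assuming a recent transformers release with support for the Nemotron hybrid architecture. The flag name used here (enable_thinking) and the loading class are assumptions for illustration only; the model card documents the exact chat-template switch and recommended setup.

```python
# Sketch: toggling Nemotron's reasoning trace via the chat template.
# The enable_thinking flag name is an assumption -- see the model card
# for the actual switch and recommended loading configuration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "user", "content": "Summarize the Tang dynasty examination system in two sentences."}
]

# Reasoning on: the model emits a reasoning trace before its final answer.
# Set the flag to False to request the final answer directly.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    enable_thinking=True,  # assumed flag name -- consult the model card
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(output_ids[0][inputs.shape[-1]:], skip_special_tokens=True))
```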

mistralai/Ministral-3-14B-Instruct-2512