The 17th LLM-jp Meeting
The 17th LLM-jp meeting was held on March 25th, 2025, at the National Institute of Informatics and online.
Program
- LLM-jp Status Report (Kurohashi) Oral
<Evaluation and Tuning/Principle Elucidation WG>
- Are Checklists Really Useful for Automatic Evaluation of Generative Tasks? (Furuhashi) [PDF]
- Introduction of Open Japanese LLM leaderboard and statistical analysis on evaluation results. (Namgi Han) [PDF]
- Analyzing the Pretraining of Japanese Large Language Models. (Nishida) [PDF]
- llm-jp-judge: Japanese LLM-as-a-Judge Evaluation Tool. (Kodama) [PDF]
- Understanding the Role of Persona and Internal Mechanisms in Large Language Models. (Ozaki) [PDF]
- How LLMs Learn: Tracing Internal Representations with Sparse Autoencoders. (Inaba) [PDF]
- A Massive Fine-tuned LLMs from Diverse Models, Tasks, Methods (Harada) [PDF]
- Comparative analysis of the Geospatial Representations in Large Language Models across Models and Languages (Otake) [PDF]
- Large-Scale Human Evaluation of LLMs for Japanese (Inoue) [PDF]
- A Study on Fine-tuning Methods for Balancing Usefulness and Safety in Japanese Large Language Models. (Katsumata) [PDF]
<Multi-modal WG>
- Developing Japanese CLIP Models Leveraging an Open-weight LLM for Large-scale Dataset Translation. (Sugiura) [PDF]
- llm-jp-eval-mm: An Evaluation Framework for Evaluating Japanese-centric Vision and Language Model. (Sugiura) [PDF]
- LLM-jp-3 VILA: Construction of Japanese Multimodal Data and Powerful Japanese Multimodal Model (Sasagawa) [PDF]
<Model Building WG>
- Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization (Nakamura) [PDF]
<Safety WG>
- Large-Scale Human Evaluation of LLM Safety (Takahashi) [PDF]
- AnswerCarefully: A Dataset for Promoting Safety of Japanese LLMs (Suzuki) [PDF]
- Developing a Dataset of Misinformation from Social Media and an Accuracy Benchmark for Large Language Models (Nakazato) [PDF]
- Development of Prompt Attack Data Collection Application for LLMs and Analysis of Collected Data Characteristics (Hayashi) [PDF]
<Corpus Construction WG>
- A Comprehensive Analysis of Memorization in Large Language Models (Kiyomaru) [PDF]
- Detection of Sensitive Personal Information in the Pre-training Corpus for Large Language Models (Minamoto) [PDF]
- Integrated Framework for LLM Domain Adaptation Based on Synthetic Data (Ogawa) [PDF]
Participants
27 on-site and about 88 online