About:
Sebastian Ruder is the author of the NLP News newsletter on Substack, which provides regular analyses of advances in natural language processing and machine learning. The publication has tens of thousands of subscribers.
The text discusses the evolving landscape of LLM evaluation, highlighting the challenges and concerns regarding the reliability of benchmarks. It emphasizes the need to mitigate memorization and overfitting in LLMs and suggests be...
Sebastian Ruder provides an update on his work at Cohere, discussing the launch of Command R and R+ models. He explains the significance of Chatbot Arena rankings and the challenges in NLP benchmarking. Command R+ outperforms GPT-...
The text discusses the recent achievement of Gemini 1.5 in machine translation, focusing on true zero-shot machine translation and the challenges of low-resource languages. It explores the creation of new translation benchmarks fo...
The AI and NLP landscape has evolved over the last five years, leading to more diverse job opportunities. The integration of BERT-based representations and large language models has narrowed the gap between fundamental and applied...
The text discusses the Big Picture Workshop at EMNLP 2023, which aimed to encourage the exploration of broader research narratives in the field of AI. It highlights the importance of understanding in-context learning, attention as...
The text discusses the paradigm shift in NLP research due to large language models (LLMs) and the challenges faced by researchers due to the high cost of fine-tuning and pre-training LLMs. It highlights five research directions th...
The EMNLP 2023 conference in Singapore is discussed, with a focus on trends in NLP research. The main topics include instruction-tuned language models, evaluation based on large language models, creative prompt usage, and multilin...
NeurIPS 2023, the largest AI conference, is taking place soon; this issue focuses on its natural language processing (NLP) papers. The main trends include large language models (LLMs), synthetic setups for analysis, aligning models based on hu...
The text covers the latest generation of instruction-tuning datasets, including data sources, quality, domain and language coverage, dialog turns, and license terms. It also discusses the latest datasets, the importance of quality...
The post discusses the representation of space and time in large language models (LLMs), focusing on a recent paper by Gurnee and Tegmark. It explores how LLMs encode spatial and temporal information, the accuracy of this representation...
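As a hedged sketch of the kind of probing analysis this post describes, the snippet below fits a linear probe on hidden states to predict two-dimensional coordinates. The data here is synthetic and the variable names are illustrative assumptions, not details taken from Gurnee and Tegmark's paper; in practice the features would be a layer's activations for prompts naming places or events, and the targets their coordinates or dates.

```python
# Minimal linear-probe sketch (synthetic data stands in for LLM activations).
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

n_entities, hidden_dim = 500, 256
hidden_states = rng.normal(size=(n_entities, hidden_dim))   # stand-in for one layer's activations
true_directions = rng.normal(size=(hidden_dim, 2))          # pretend lat/long live on two linear directions
coords = hidden_states @ true_directions + rng.normal(scale=0.1, size=(n_entities, 2))

X_train, X_test, y_train, y_test = train_test_split(hidden_states, coords, random_state=0)

probe = Ridge(alpha=1.0).fit(X_train, y_train)              # the "probe" is just a ridge regression
print("probe R^2 on held-out entities:", probe.score(X_test, y_test))
```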
The text discusses the concept of instruction tuning in NLP and ML, covering popular datasets for instruction tuning and the main differences between instruction tuning and standard supervised fine-tuning. It also provides example...
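A minimal, hypothetical illustration of that difference follows; the field names and the sentiment example are assumptions for illustration, not a specific dataset's schema. Standard supervised fine-tuning pairs an input with a label for one fixed task, whereas an instruction-tuning example states the task in natural language so heterogeneous tasks can share a single format.

```python
# Illustrative (hypothetical) examples of the two data formats.

# Standard supervised fine-tuning: fixed task, input -> label.
sft_example = {
    "text": "The movie was a complete waste of time.",
    "label": "negative",
}

# Instruction tuning: the task itself is described in natural language,
# so many tasks can be mixed in one (instruction, input, output) format.
instruction_example = {
    "instruction": "Classify the sentiment of the following review as positive or negative.",
    "input": "The movie was a complete waste of time.",
    "output": "negative",
}

# Training then maximizes the likelihood of `output` given the concatenated
# instruction and input, typically with the usual next-token cross-entropy loss.
```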
The text discusses the use of tools in large language models (LLMs) to address their limitations and improve their capabilities. It explores the types of tools, benefits of tool use, recent developments, and future directions. The autho...
The text discusses the components of and implications for building generative agents, as well as publication norms and venues for large language models (LLMs). It explores the use of LLMs in creating persona-based bots, simu...
The newsletter discusses recent developments in NLP and large language models. It covers the limitations of the Transformer architecture, efficient attention methods, and the implications of long-sequence modeling for LLMs and...
The author discusses moving to Substack after Twitter announced the shutdown of Revue. They also talk about scaling up language models and text-to-image generation, highlighting the challenges and potential of large language mo...
The newsletter covers PaLM, DALL-E 2, and Chinchilla; chain-of-thought prompting; and the role of values and culture in NLP. It discusses recent progress in ML and NLP, large pre-trained models, and the emergence of new NLP st...
The text discusses highlights of 2021 in the field of machine learning and natural language processing, focusing on pre-trained models, new tasks, and graph machine learning. It also covers the importance of safety in pre-trained ...
The author has moved from DeepMind to Google Research and plans to continue working on multilingual NLP with a focus on under-represented languages. They discuss multi-task learning, pre-training objectives, and the recent papers ...
The text covers a variety of topics including papers from ICML 2021, open collaboration in ML research, art generated by the CLIP model, leveraging information from the Internet in models, and new benchmarks in the style of GLUE. ...
The text discusses the biggest advances in technology, including GitHub Copilot, the Perceiver, and models that forgo self-attention. It also talks about the challenges of writing the newsletter and the need to strike the right balance b...