Monitoring ChatGPT Drifts Reveals Substantial Behavior Changes Over Time

AI summary: Stanford and UC Berkeley researchers found significant behavioral changes in large language models (LLMs)…

Announcing MPT-7B-8K: 8K Context Length for Document Understanding

AI summary: MPT-7B-8K, a 7B parameter open-source large language model (LLM) with an 8k context…

Microsoft’s AI-powered Copilot is coming to Microsoft Teams phone and chat

AI summary: Microsoft is expanding its AI-powered Microsoft 365 Copilot feature into the calls and chat…

Meta and Microsoft Introduce the Next Generation of Llama

AI summary: Meta is open-sourcing its AI model, Llama 2, for research and commercial use, in…

Wix will let you build an entire website using only AI prompts

AI summary: Wix, the website builder, is developing an AI Site Generator that creates websites based…

Faster Transformers for Longer Context with FlashAttention-2

AI summary: Stanford University researchers have developed FlashAttention-2, a technique that accelerates the training of large…

Retentive Networks: The Next Evolution of Transformers for AI?

AI summary: Microsoft researchers propose a new neural network architecture, Retentive Networks (RetNets), that could outperform…

curated-transformers: 🤖 A PyTorch library

AI summary: Curated Transformers is a new PyTorch library offering state-of-the-art transformer models built from reusable…

Meta claims its new art-generating model is best-in-class

AI summary: Meta has announced CM3Leon, an AI model that excels in text-to-image generation. Unlike most…

China mandates that AI must follow “core values of socialism”

AI summary: China’s Cyberspace Administration has issued new guidelines for generative AI services, limiting public use…

Claude 2: ChatGPT rival launches chatbot that can summarise a novel

AI summary: US-based AI company Anthropic has launched a chatbot, Claude 2, that can summarize large…

GPT4- All Details Leaked

AI summary: Leaked details about GPT-4 reveal a model size of 1.8 trillion parameters across 120…

LLM agents and integration dead-ends

AI summary: The integration of large language models (LLMs) into business applications could unlock significant economic…

Transformers Learn Math: The Power of Random Initialization

Real Photo Disqualified From Photography Contest For Being AI

AI summary: A photograph taken by Suzi Dougherty was disqualified from a competition held by Charing…

GPT-4 Architecture, Infrastructure, Training Dataset, Costs, Vision, MoE

AI summary: OpenAI’s GPT-4 model architecture is not a secret, but a replicable solution with complex…

Meet LongLLaMA: A Large Language Model Capable of Handling Long Contexts of 256k Tokens

AI summary: Researchers have developed the Focused Transformer (FOT), a technique that addresses the challenge of…

New AI tool can help treat brain tumors more quickly and accurately, study finds

AI summary: Harvard Medical School researchers have developed an artificial intelligence (AI) tool that could improve…

Machine learning enables accurate electronic structure calculations at large scales for material modeling

AI summary: Researchers from CASUS at HZDR, Germany, and Sandia National Laboratories, U.S., have developed a…

torchscale: Transformers at any scale