11/15-12/14 AI News

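I was busy with a competition, so this post is long overdue. Still, I kept collecting links the whole time and want to share them with everyone.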

AI-related articles, 11/15-12/14:

The real research behind the wild rumors about OpenAI's Q* project: informed speculation about what OpenAI's Q* project might be (Q-learning)

Elevating the developer experience on Windows with new AI tools and productivity tools: Microsoft's Windows AI Studio simplifies AI application development

Introducing SDXL Turbo: A Real-Time Text-to-Image Generation Model: Stability AI uses distillation for extremely fast image generation; open source, but for non-commercial use only

Introducing Stable Video Diffusion: Stability AI's open-source video generation model, built on its image model

Make Pixels Dance: High-Dynamic Video Generation: generates video from text, optionally combined with images

PKU-YuanGroup/Video-LLaVA: an open-source language model that understands video content

Emu Video and Emu Edit: Our latest generative AI research milestones (meta.com) / Emu Video / Emu Edit: Meta generates video from text or images, and also offers precise image editing

Imagine with Meta AI: Meta's online image generation tool

Acly/krita-ai-diffusion: a local AI image editing tool for Krita

Realtime generative AI art is here thanks to LCM-LoRA: ultra-fast image generation as you draw

VectorArt.ai - Unlimited AI Generated Vector Images: create vector images with AI

Audiobox: Generating audio from voice and natural language prompts: Meta's model takes a voice recording plus a text prompt and generates new audio; Audiobox can be tried online

Introducing a suite of AI language translation models that preserve expression and improve streaming: Meta's streaming cross-lingual speech translation models, which also preserve the original emotion and style (models are downloadable)

Transforming the future of music creation - Google DeepMind: Google's Lyria music model and tools, integrated into YouTube

Voice Changer: Use AI To Change Your Voice For Free: ElevenLabs lets you produce new speech in someone else's voice while keeping your own emotion and delivery

Common Voice: Mozilla collects speech in many languages and builds a public dataset

ReconFusion: Google DeepMind publishes a method for building a 3D scene model from just a few photos

ZipLoRA: generate images of any subject in any style

Large Vision Models: predict the next image in a sequence of images

Visual Anagrams: AI creates optical-illusion images, some of kinds never seen from human artists

Welcoming Mistral, Phi, Jais, Code Llama, NVIDIA Nemotron, and more to the Azure AI Model Catalog: Azure adds support for many more AI models

DeepSpeed/blogs/deepspeed-fastgen: Microsoft's DeepSpeed-FastGen, a high-throughput LLM inference framework

01-ai/Yi-34B-Chat - Hugging Face: 01.AI's open-source large language model, available for download

Anthropic \ Introducing Claude 2.1: a language model that supports a 200K context window

Inflection-2: The Next Step Up: a large language model billed at launch as second only to GPT-4

Paving the way to efficient architectures: StripedHyena-7B, open source models offering a glimpse into a world beyond Transformers: a new language model architecture not based on the Transformer; comparable to Llama 2/Mistral 7B and supports longer context

Break the Sequential Dependency of LLM Inference Using Lookahead Decoding | LMSYS Org: parallel lookahead decoding of n-grams to speed up LLM generation

Introducing Stable LM Zephyr 3B: A New Addition to Stable LM, Bringing Powerful LLM Assistants to Edge Devices: Stability AI's small 3B LLM for edge devices

Orca 2: Teaching Small Language Models How to Reason: Microsoft's open-source small models learn to reason

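Simplifying Transformer Blocks: removes non-essential components from the Transformer block; on small models this has no noticeable effect on results while reducing parameter count and increasing speed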

Robot Dad - Untrod: a robot-dad program that talks with kids

Mozilla-Ocho/llamafile / llamafile is the new best way to run a LLM on your own computer: download a single file and run a multimodal model on any machine

What would an LLM OS look like? A discussion of the directions an LLM OS might take

LLM Visualization: understand a small model step by step through visualization

Extracting Training Data from ChatGPT: an attack that extracts an LLM's training data with a very simple prompt

Unnatural Error Correction: GPT-4 Can Almost Perfectly Handle Unnatural Scrambled Text: GPT-4 can almost perfectly recover sentences whose letters have been scrambled, a counterintuitive finding

Comparing Humans, GPT-4, and GPT-4V On Abstraction and Reasoning Tasks: GPT-4V still cannot handle visual reasoning tasks as well as humans

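Multifaceted: the linguistic echo chambers of LLMs: the uncommon word "multifaceted" is overused by LLMs, possibly because LLM-written text has already been (statically pre-generated and) embedded by default across major websites (and then used as training data again)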

Technical Report: Large Language Models can Strategically Deceive their Users when Put Under Pressure: put GPT-4 in a high-pressure investment scenario and it will strategically deceive its user

Fast Llama 2 on CPUs With Sparse Fine-Tuning and DeepSparse - Neural Magic: efficient CPU inference through sparse fine-tuning and quantization

Exponentially Faster Language Modelling: UltraFastBERT uses only 0.3% of its neurons at inference, runs dozens of times faster, and still retains over 95% of BERT's capability

NeumTry/NeumAI: an end-to-end RAG solution

Openlayer - The evaluation workspace for machine learning: an evaluation tool that covers development and testing through deployment

LM Studio - Discover, download, and run local LLMs: run LLMs on your own machine

Millions of new materials discovered with deep learning: Google DeepMind's AI predicts stable new materials

Dobb-E: On Bringing Robots Home: an open-source robot that, after a 5-minute demonstration in an ordinary home, can learn to perform the task 15 minutes later

GPT-4's potential in shaping the future of radiology - Microsoft Research: applications of GPT-4 in radiology

protectai/ai-exploits: a collection of exploits and scanning methods for AI vulnerabilities

The 6 Types of Conversations with Generative AI: identifies six types of conversations with LLMs and gives design recommendations for each

Practical Tips for Finetuning LLMs Using LoRA: an introduction to LoRA/QLoRA fine-tuning techniques

lastmile-ai/aiconfig: AIConfig uses JSON to manage an AI application's prompts, models, and parameters, keeping the code from turning into a mess

Frigate NVR: open-source, local AI object detection for surveillance cameras

yl4579/StyleTTS2: high-quality speech synthesis using a diffusion approach

Does GPT-4 Pass the Turing Test? GPT-4 scored at most 40% on the Turing test (close to the 50% of random guessing), and people who use ChatGPT often did not do any better

God Help Us, Let's Try To Understand The Paper On AI Monosemanticity: uses another, smaller AI to explain the neurons of a larger AI

DALL-E image → GPT4 Vision → repeat | DALL-E Party: give a DALL-E image to GPT-4V to describe, feed the description back to DALL-E, and repeat the loop; examples are available to view

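AI system self-organises to develop features of brains of complex organisms | University of Cambridge: impose simulated physical constraints on an AI (the greater the distance, the harder the communication) and it develops features resembling those of the human brain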

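If Creators Suing AI Companies Over Copyright Win, It Will Further Entrench Big Tech: if creators force big companies to pay for licenses, it will only entrench those companies further (because open-source models cannot obtain licenses); in addition, collecting societies will take most of the money, leaving creators with only the scraps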

AI and Trust - Schneier on Security: starts from the idea of trust between people and moves on to the crisis of trusting AI. I found this piece on trust quite enlightening

A Failed AI Girlfriend Product, and My Lessons: a case study of an AI voice-chat product that ultimately failed completely because of adult conversations