Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024)
An architectural persistence experiment for large language models. Claude’s Home gives an AI time, memory, and place by combining scheduled execution with a durable filesystem, allowing one continuous instance to reflect, create, and evolve across sessions.
Production-grade architecture patterns, decision frameworks, and best practices for building reliable AI agents. Framework-agnostic reference for engineers.
Visualizations of key concepts in LLM architectures.
Jetta-Reinforcement-Learning-Hybrid-LLM-Architecture
The Compositional Agentic Architecture (CAA): A blueprint for building reliable, deterministic, and safe industrial AI agents.
Multi-agent, policy-driven AI system for processing sensitive enterprise documents with extraction, analysis, verification, deterministic orchestration, and full audit logging. Designed for regulated environments (banking, finance, insurance).
A collection of Small Language Models (SLMs) built from scratch in PyTorch.
The first end-to-end programming language and compiler fully developed by AI.
Production-oriented Telegram → n8n → FastAPI intake CRM with a deterministic state machine and an audit log.
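A deterministic state machine of the kind this repo describes reduces to a transition table: each (state, event) pair maps to exactly one next state, and every step is appended to the audit log. A minimal Python sketch, with states and events invented for illustration (none are taken from the repo):

```python
from enum import Enum, auto

class State(Enum):
    NEW = auto()
    QUALIFYING = auto()
    SCHEDULED = auto()
    CLOSED = auto()

# Exactly one legal next state per (state, event):
# the same input sequence always produces the same path.
TRANSITIONS = {
    (State.NEW, "contact_received"): State.QUALIFYING,
    (State.QUALIFYING, "qualified"): State.SCHEDULED,
    (State.QUALIFYING, "rejected"): State.CLOSED,
    (State.SCHEDULED, "completed"): State.CLOSED,
}

def step(state: State, event: str, audit: list) -> State:
    nxt = TRANSITIONS.get((state, event))
    if nxt is None:
        raise ValueError(f"illegal event {event!r} in state {state.name}")
    audit.append((state.name, event, nxt.name))  # append-only audit trail
    return nxt

audit: list = []
s = step(State.NEW, "contact_received", audit)
s = step(s, "qualified", audit)
print(s, audit)
```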
A Logical Virtual Memory (LVM)-based Instruction Set Architecture (ISA) for LLM context management. Models the LLM as a logic processor, using recursive logic trees and hierarchical addressing to counter attention dilution and intelligence collapse in long-horizon tasks.
Technical architecture and engineering lessons from building MyMate — a persistent-memory AI desktop application for long-session performance.
HSPMN: Hybrid Sparse-Predictive Matter Network, an LLM architecture optimized for Blackwell GPUs that bridges O(N) and O(N^2) routing via ALF-LB.
Hackable PyTorch template for decoder-only transformer architecture experiments. Llama baseline with RoPE, SwiGLU, and RMSNorm. Swap components, train, compare.
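Two of the named Llama-style components are compact enough to sketch. A minimal PyTorch version of RMSNorm and SwiGLU, with module and dimension names chosen for illustration rather than taken from the template:

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Root-mean-square norm: rescale by the RMS of the features, no mean-centering."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)
        return self.weight * (x * rms)

class SwiGLU(nn.Module):
    """Gated feed-forward block: silu(x W_gate) * (x W_up), projected back down."""
    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.gate = nn.Linear(dim, hidden, bias=False)
        self.up = nn.Linear(dim, hidden, bias=False)
        self.down = nn.Linear(hidden, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down(torch.nn.functional.silu(self.gate(x)) * self.up(x))
```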
Reference architecture for structured AI memory lifecycle management — from the OPHION Memory OS Protocol.
Codebase ideation (structured the Django way, for easier understanding) for an LLM built without pre-trained models: custom embeddings (TF-IDF or Word2Vec) with FAISS for vector storage.
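A sketch of that pipeline under its stated assumptions (TF-IDF vectors standing in for embeddings, FAISS for nearest-neighbor search); the corpus, query, and variable names are invented for illustration:

```python
import faiss
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

docs = ["the cat sat on the mat", "dogs chase cats", "language models embed text"]

# Dense TF-IDF vectors as stand-in embeddings (no pre-trained model involved).
vectorizer = TfidfVectorizer()
vectors = vectorizer.fit_transform(docs).toarray().astype("float32")

# Exact inner-product index; L2-normalizing first makes scores cosine similarities.
faiss.normalize_L2(vectors)
index = faiss.IndexFlatIP(vectors.shape[1])
index.add(vectors)

query = vectorizer.transform(["a cat on a mat"]).toarray().astype("float32")
faiss.normalize_L2(query)
scores, ids = index.search(query, 2)
print(ids[0], scores[0])  # nearest documents and their cosine scores
```

Normalizing before indexing is the usual choice for TF-IDF retrieval, since raw inner products would otherwise favor longer documents.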
Living comparison table of LLM architectural choices (norm, attention, MoE, positional encoding, and more) from the Original Transformer (2017) to frontier models (2026). Based on Harm de Vries's figure, Sebastian Raschka's Big LLM Architecture Comparison, and Tatsunori Hashimoto's Stanford CS 336 lecture.
Architectural canon for production-grade RAFT / RAG systems: evaluation, safety, abstention, failure modes
A Modular Knowledge Transfer System for Large Language Models
Internal cognitive architecture of the AI persona “Chapiko.”