Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Updated May 19, 2026 - Jupyter Notebook
Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
Papers about pretraining and self-supervised learning on Graph Neural Networks (GNN).
[ICLR'23 Spotlight🔥] The first successful BERT/MAE-style pretraining on any convolutional network; PyTorch implementation of "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling"
Recent Advances in Vision and Language PreTrained Models (VL-PTMs)
Official Repository for the Uni-Mol Series Methods
EntitySeg Toolbox: Towards Open-World and High-Quality Image Segmentation
GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation
X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).
Pretraining and inference code for a large-scale depth-recurrent language model
【ICLR 2024🔥】 Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
A curated list of 3D vision papers related to robotics in the era of large models (i.e., LLMs/VLMs), inspired by awesome-computer-vision, including papers, code, and related websites
Official PyTorch implementation of the paper "ImageNet-21K Pretraining for the Masses" (NeurIPS 2021)
The official implementation of MARS: Unleashing the Power of Variance Reduction for Training Large Models
Personal project: MPP-Qwen14B & MPP-Qwen-Next (Multimodal Pipeline Parallel based on Qwen-LM). Supports [video/image/multi-image] {sft/conversations}. Don't let poverty limit your imagination! Train your own 8B/14B LLaVA-training-like MLLM on RTX3090/4090 24GB.
Best practice for training LLaMA models in Megatron-LM
Official repository for "Craw4LLM: Efficient Web Crawling for LLM Pretraining"