Tony Shin
2.62K subscribers
1:16
VeRA: Vector-based Random Matrix Adaptation
Tony Shin
260 views • 8 months ago
0:50
PaLI-3 Vision Language Models: Smaller, Faster, Stronger
Tony Shin
433 views • 8 months ago
1:25
HyperAttention: Long-context Attention in Near-Linear Time
Tony Shin
373 views • 8 months ago
1:15
Fast Feedforward Networks
Tony Shin
418 views • 8 months ago
1:14
Nougat: Neural Optical Understanding for Academic Documents
Tony Shin
238 views • 8 months ago
1:05
Retentive Network: A Successor to Transformer for Large Language Models
Tony Shin
797 views • 8 months ago
1:09
LLaVA: Visual Instruction Tuning
Tony Shin
899 views • 8 months ago
1:56
BloombergGPT: A Large Language Model for Finance
Tony Shin
424 views • 1 year ago
3:02
ImageBind: One Embedding Space To Bind Them All
Tony Shin
878 views • 1 year ago
2:00
Segment Anything
Tony Shin
470 views • 1 year ago
2:17
Are Emergent Abilities of Large Language Models a Mirage?
Tony Shin
2.6K views • 1 year ago
1:12
Synthetic Data Boosts ImageNet Classification
Tony Shin
214 views • 1 year ago
0:47
Unlimiformer: Long-Range Transformers with Unlimited Length Input
Tony Shin
718 views • 1 year ago
23:34
[Tutorial] Image Super Resolution without Photoshop
Tony Shin
1.1K views • 2 years ago
10:32
YOLO9000: Better, Faster, Stronger
Tony Shin
1.1K views • 2 years ago
15:10
NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion
Tony Shin
1K views • 2 years ago
10:27
Florence: A New Foundation Model for Computer Vision
Tony Shin
1.2K views • 2 years ago
8:03
DSSD: Deconvolutional Single Shot Detector
Tony Shin
585 views • 2 years ago
8:02
MAE: Masked Autoencoders Are Scalable Vision Learners
Tony Shin
4.8K views • 2 years ago
5:01
PVANet: Deep but Lightweight Neural Networks for Real-time Object Detection
Tony Shin
374 views • 2 years ago
5:36
SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers
Tony Shin
4.2K views • 2 years ago
6:32
R-FCN: Object Detection via Region-based Fully Convolutional Networks
Tony Shin
1K views • 2 years ago
5:28
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Tony Shin
2.2K views • 2 years ago
7:33
Pix2Seq: A Language Modeling Framework for Object Detection
Tony Shin
1.6K views • 2 years ago
2:41
Improved Regularization of Convolutional Neural Networks with Cutout
Tony Shin
395 views • 2 years ago
7:13
VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning
Tony Shin
2.6K views • 3 years ago
3:23
SSD: Single Shot MultiBox Detector
Tony Shin
5.6K views • 3 years ago
4:33
Barlow Twins: Self-Supervised Learning via Redundancy Reduction
Tony Shin
2K views • 3 years ago
5:22
MLP-Mixer: An all-MLP Architecture for Vision
Tony Shin
1.6K views • 3 years ago
4:09
YOLO: Unified, Real-Time Object Detection
Tony Shin
850 views • 3 years ago