
GGML Deep Dive VII: Tensor Representation and Memory Layout
Introduction

In previous posts, we’ve encountered the concept of tensors in GGML many times. However, we’ve only explored their simplest usage: cases without quantization, without permutation (so the tensor has a contiguous in-memory layout), and without tensor views. In more complex scenarios, tensors exhibit far more intricate, sometimes even counterintuitive, behavior. In this post, I’ll take a deeper dive into how tensors work in GGML.

ggml_tensor Data Structure Overview

First, let’s take a look at the ggml_tensor data structure defined in include/ggml.h. Here are some fields that you should pay attention to: ...
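For orientation, here is a condensed sketch of the struct as it appears in recent versions of include/ggml.h, trimmed to the fields most relevant to this post. The exact field set and ordering differ between ggml versions, and the comments below are my own annotations rather than the upstream ones.

```c
// Condensed from include/ggml.h (recent ggml versions); not the full struct.
struct ggml_tensor {
    enum ggml_type type;           // data type, e.g. GGML_TYPE_F32 or a quantized type

    int64_t ne[GGML_MAX_DIMS];     // number of elements per dimension (ne[0] is the innermost)
    size_t  nb[GGML_MAX_DIMS];     // stride in bytes per dimension; for a contiguous tensor:
                                   //   nb[0] = ggml_type_size(type)
                                   //   nb[1] = nb[0] * (ne[0] / ggml_blck_size(type))
                                   //   nb[i] = nb[i-1] * ne[i-1]

    enum ggml_op op;               // the operation that produced this tensor
    struct ggml_tensor * src[GGML_MAX_SRC]; // source tensors of that operation

    struct ggml_tensor * view_src; // non-NULL if this tensor is a view into another tensor
    size_t               view_offs; // byte offset of the view into view_src's data

    void * data;                   // pointer to the actual buffer holding the elements

    char name[GGML_MAX_NAME];      // human-readable name, handy for debugging

    // ... other fields omitted (buffer, op_params, flags, extra, ...)
};
```

The ne/nb pair, the quantization-aware type, and the view_src/view_offs fields are exactly where the "intricate behaviors" mentioned above come from, and they are what the rest of this post digs into.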