Python Memory Allocation Mechanisms

与C同行 2022-12-10

It has been a while since my last update, so here is the next installment.

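Today's topic is memory allocation. For anyone who wants to understand Python in depth, learning how Python allocates memory is essential. Memory is a deep subject, and only now do I feel able to give a simple introduction to it. Memory is like space or a container: a cup holds water, a bowl holds food, and our data and code also need somewhere to live. Loosely speaking, that somewhere is memory. In C, memory is addressed through pointers, and because CPython is written in C, Python's memory is also managed through pointers underneath.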

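Now that we know CPython manages memory through pointers, we can dig further. First, a summary of the ways Python allocates memory: one, raw C-style allocation; two, Python's fine-grained block allocation; three, Python's tracked allocation for garbage-collected objects. These names are my own labels (the CPython source calls the second one pymalloc). Let's look at each in turn.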

Raw C-style memory allocation

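This kind of allocation is essentially the same as allocating memory in C. The one difference is a minimum size: a request for 0 bytes is bumped up to 1 byte. All C code shown below comes from the CPython 3.8.10 source. Here is the code: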

static void *
_PyMem_RawMalloc(void *ctx, size_t size)
{
    /* PyMem_RawMalloc(0) means malloc(1). Some systems would return NULL
       for malloc(0), which would be treated as an error. Some platforms would
       return a pointer with no memory behind it, which would break pymalloc.
       To solve these problems, allocate an extra byte. */
    if (size == 0)
        size = 1;
    return malloc(size);
}


static void *
_PyMem_RawCalloc(void *ctx, size_t nelem, size_t elsize)
{
    /* PyMem_RawCalloc(0, 0) means calloc(1, 1). Some systems would return NULL
       for calloc(0, 0), which would be treated as an error. Some platforms
       would return a pointer with no memory behind it, which would break
       pymalloc.  To solve these problems, allocate an extra byte. */
    if (nelem == 0 || elsize == 0) {
        nelem = 1;
        elsize = 1;
    }
    return calloc(nelem, elsize);
}


static void *
_PyMem_RawRealloc(void *ctx, void *ptr, size_t size)
{
    if (size == 0)
        size = 1;
    return realloc(ptr, size);
}


static void
_PyMem_RawFree(void *ctx, void *ptr)
{
    free(ptr);
}

These correspond to the C memory functions malloc, calloc, realloc, and free, respectively.
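The zero-size rule above can be observed from Python itself. The following is a minimal sketch, assuming it runs on CPython, where ctypes.pythonapi exposes the public C-API wrappers PyMem_RawMalloc and PyMem_RawFree; the helper name raw_alloc_succeeds is made up for illustration.

```python
import ctypes

# Bind the public C-API wrappers around the raw allocator.
# Assumption: CPython, where ctypes.pythonapi exports these symbols.
raw_malloc = ctypes.pythonapi.PyMem_RawMalloc
raw_malloc.restype = ctypes.c_void_p
raw_malloc.argtypes = [ctypes.c_size_t]

raw_free = ctypes.pythonapi.PyMem_RawFree
raw_free.restype = None
raw_free.argtypes = [ctypes.c_void_p]


def raw_alloc_succeeds(size):
    """Allocate `size` bytes with the raw allocator and report success."""
    ptr = raw_malloc(size)      # ctypes maps a NULL pointer to None
    ok = ptr is not None
    if ok:
        raw_free(ptr)           # always hand the memory back
    return ok


# A 0-byte request is turned into a 1-byte request, so it should not
# come back as NULL unless the system is truly out of memory.
print(raw_alloc_succeeds(0))
print(raw_alloc_succeeds(64))
```

Calling the allocator through ctypes is only for demonstration; ordinary Python code never needs to do this.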

Python fine-grained memory allocation

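Python is a high-level language, so relying only on this old-fashioned allocation would be a bit beneath it. More to the point, Python creates so many objects that allocating each one this way would waste both memory and time. What can be done? The most direct way to use memory efficiently is to cut down on trips to the system allocator: request memory in advance and manage it dynamically, as many modern languages do. To follow this, we need three definitions from the CPython source. First, arena_object describes an arena, a large region of memory; in CPython 3.8.10 an arena is 256 KB. Second, block: in the source, block is a typedef for uint8_t used for pointer arithmetic, and in discussion a block is one allocation unit whose size is a multiple of the alignment (8 bytes in this article). Third, and most important, pool_header is the header at the start of each of Python's memory pools. Because an operating-system page is typically 4 KB, a pool is also 4 KB. Here is the definition of the pool header: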

/* Pool for small blocks. */
struct pool_header {
    union { block *_padding;
            uint count; } ref;          /* number of allocated blocks    */
    block *freeblock;                   /* pool's free list head         */
    struct pool_header *nextpool;       /* next pool of this size class  */
    struct pool_header *prevpool;       /* previous pool       ""        */
    uint arenaindex;                    /* index into arenas of base adr */
    uint szidx;                         /* block size class index        */
    uint nextoffset;                    /* bytes to virgin block         */
    uint maxnextoffset;                 /* largest valid nextoffset      */
};

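One more thing to know: Python serves requests of up to 512 bytes through the fine-grained allocator, and every request is rounded up to a multiple of the alignment. With 8-byte alignment, 512 / 8 = 64, so there are 64 size classes, mapped as 0: 8 bytes, 1: 16 bytes, …, 63: 512 bytes. (On 64-bit builds, CPython 3.8 aligns to 16 bytes instead, which gives 32 size classes.)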
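The mapping from request size to size class is simple arithmetic. Here is a small sketch of it, assuming the 8-byte alignment used in this article by default; the function names size_class_index and block_size are hypothetical, and the alignment parameter is there so the 16-byte case can be tried as well.

```python
SMALL_REQUEST_THRESHOLD = 512  # largest request served by the fine-grained allocator


def size_class_index(nbytes, alignment=8):
    """Return the size class index for a request of `nbytes` bytes."""
    if not 0 < nbytes <= SMALL_REQUEST_THRESHOLD:
        raise ValueError("request is not handled by the fine-grained allocator")
    shift = alignment.bit_length() - 1   # 8 -> 3, 16 -> 4
    return (nbytes - 1) >> shift


def block_size(index, alignment=8):
    """Return the block size in bytes that a size class index stands for."""
    return (index + 1) * alignment


for request in (1, 8, 9, 16, 512):
    idx = size_class_index(request)
    print(request, "->", idx, "->", block_size(idx), "bytes")
```

A 9-byte request lands in class 1 and therefore occupies a 16-byte block: rounding up costs a little space but keeps every block in a pool the same size.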

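Here is what each member of the pool header means:

ref: the number of blocks currently allocated from this pool (ref.count);
freeblock: the head of the pool's free list, that is, the address of the next block available for allocation;
nextpool: the next pool of the same size class (doubly linked list);
prevpool: the previous pool of the same size class (doubly linked list);
arenaindex: the index of the arena this pool belongs to;
szidx: the size class index of the pool's blocks, for example 16 bytes is index 1 and 512 bytes is index 63. Each pool has exactly one szidx, so a pool hands out blocks of only one size;
nextoffset: the byte offset, from the start of the pool, of the next never-used block; it advances as the pool hands out fresh blocks;
maxnextoffset: the largest valid nextoffset. Once nextoffset exceeds it, the pool has no never-used blocks left, and it is full unless freed blocks remain on the free list.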
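To make the roles of these members concrete, here is a toy model of a single pool. It is a simplified sketch, not CPython's actual algorithm: the class ToyPool is hypothetical, POOL_OVERHEAD = 48 is an assumed header size for a 64-bit build, and the free list is kept as a Python list of offsets, whereas the real pool threads it through the freed blocks themselves.

```python
POOL_SIZE = 4096      # one pool is 4 KB
POOL_OVERHEAD = 48    # assumption: size of pool_header on a 64-bit build


class ToyPool:
    """Toy pool that hands out fixed-size blocks as byte offsets."""

    def __init__(self, block_size):
        self.block_size = block_size
        self.count = 0                        # plays the role of ref.count
        self.free_list = []                   # plays the role of the freeblock chain
        self.nextoffset = POOL_OVERHEAD       # first never-used block
        self.maxnextoffset = POOL_SIZE - block_size

    def alloc(self):
        """Return the offset of a block, or None if the pool is full."""
        if self.free_list:
            offset = self.free_list.pop()     # reuse a freed block first
        elif self.nextoffset <= self.maxnextoffset:
            offset = self.nextoffset          # carve out a never-used block
            self.nextoffset += self.block_size
        else:
            return None                       # nothing left in this pool
        self.count += 1
        return offset

    def free(self, offset):
        """Put a block back on the free list."""
        self.free_list.append(offset)
        self.count -= 1


pool = ToyPool(block_size=64)
first = pool.alloc()
second = pool.alloc()
pool.free(first)
print(first, second, pool.alloc())  # the freed offset is handed out again
```

The model shows why a pool serves only one block size: offsets advance in fixed steps of block_size, so freed blocks can be reused for any later request of the same class.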

Python tracked memory allocation

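We all know that Python frees memory through reference counting, but that is not the whole story, because reference counting has a fatal flaw: circular references. For example: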

a = []; b = []; a.append(b); b.append(a)

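In this situation, a and b keep each other's reference counts above 0 even after the names a and b are deleted, so reference counting alone can never release them. To solve this, Python introduced the gc module, which uses a mark-and-sweep style algorithm. We skip the details here and only cover the general idea. First, note that circular references can only arise through containers: objects such as lists and dictionaries can hold references to other containers. So for containers Python needs a special allocation path, which I call gc allocation: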

static PyObject *
_PyObject_GC_Alloc(int use_calloc, size_t basicsize)
{
    struct _gc_runtime_state *state = &_PyRuntime.gc;
    PyObject *op;
    PyGC_Head *g;
    size_t size;
    if (basicsize > PY_SSIZE_T_MAX - sizeof(PyGC_Head))
        return PyErr_NoMemory();
    size = sizeof(PyGC_Head) + basicsize;
    if (use_calloc)
        g = (PyGC_Head *)PyObject_Calloc(1, size);
    else
        g = (PyGC_Head *)PyObject_Malloc(size);
    if (g == NULL)
        return PyErr_NoMemory();
    assert(((uintptr_t)g & 3) == 0);  // g must be aligned 4bytes boundary
    g->_gc_next = 0;
    g->_gc_prev = 0;
    state->generations[0].count++; /* number of allocated GC objects */
    if (state->generations[0].count > state->generations[0].threshold &&
        state->enabled &&
        state->generations[0].threshold &&
        !state->collecting &&
        !PyErr_Occurred()) {
        state->collecting = 1;
        collect_generations(state);
        state->collecting = 0;
    }
    op = FROM_GC(g);
    return op;
}

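We can see that a gc allocation requests extra space for a PyGC_Head in front of the object. PyGC_Head is a doubly linked list node that links together the objects the collector tracks: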

/* GC information is stored BEFORE the object structure. */
typedef struct {
    // Pointer to next object in the list.
    // 0 means the object is not tracked
    uintptr_t _gc_next;


    // Pointer to previous object in the list.
    // Lowest two bits are used for flags documented later.
    uintptr_t _gc_prev;
} PyGC_Head;
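The extra header is visible from Python. sys.getsizeof is documented to add the garbage collector overhead on top of what an object's own __sizeof__ reports, so the difference between the two is a rough way to see the header. This is a small sketch; the helper name gc_header_overhead is hypothetical, and the exact number depends on the build.

```python
import sys


def gc_header_overhead(obj):
    """Bytes that sys.getsizeof adds on top of obj.__sizeof__()."""
    return sys.getsizeof(obj) - obj.__sizeof__()


# A list is a container, so it carries the collector's header.
print("list:", gc_header_overhead([]))
# An int cannot take part in a reference cycle, so it has no header.
print("int: ", gc_header_overhead(1))
```

On a typical 64-bit build the list overhead is expected to match the two pointer-sized fields of PyGC_Head, while the int overhead is zero.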

Because of this linked list, the gc module can find gc-allocated objects and garbage-collect them. We will not go into that for now.
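Even without the internals, the effect of the collector on a reference cycle is easy to observe. The sketch below assumes CPython's reference-counting behaviour; Node is a hypothetical class used in place of a list because plain lists cannot be the target of a weak reference, and the weak reference serves only as a probe to check whether the object is still alive.

```python
import gc
import weakref


class Node:
    """Hypothetical container-like object that can reference another Node."""

    def __init__(self):
        self.other = None


def cycle_survives_until_collected():
    """Return (alive before gc.collect(), alive after gc.collect())."""
    was_enabled = gc.isenabled()
    gc.disable()                      # keep the automatic collector out of the way
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a       # build the reference cycle
        probe = weakref.ref(a)        # watches `a` without keeping it alive
        del a, b                      # drop our own references
        alive_before = probe() is not None
        gc.collect()                  # run the cycle collector by hand
        alive_after = probe() is not None
    finally:
        if was_enabled:
            gc.enable()
    return alive_before, alive_after


print(cycle_survives_until_collected())  # (True, False)
```

The first value shows that reference counting alone leaves the cycle in memory; the second shows that the gc module reclaims it.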

At this point we can put the three allocation paths together. Let's see how memory is allocated for an object:

PyObject *
PyType_GenericAlloc(PyTypeObject *type, Py_ssize_t nitems)
{
    PyObject *obj;
    const size_t size = _PyObject_VAR_SIZE(type, nitems+1);
    /* note that we need to add one, for the sentinel */


    if (PyType_IS_GC(type))
        obj = _PyObject_GC_Malloc(size);
    else
        obj = (PyObject *)PyObject_MALLOC(size);


    if (obj == NULL)
        return PyErr_NoMemory();


    memset(obj, '\0', size);


    if (type->tp_itemsize == 0)
        (void)PyObject_INIT(obj, type);
    else
        (void) PyObject_INIT_VAR((PyVarObject *)obj, type, nitems);


    if (PyType_IS_GC(type))
        _PyObject_GC_TRACK(obj);
    return obj;
}

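Here we can see that a gc type takes the gc allocation path, while a non-gc type goes straight to the fine-grained allocator. (The gc path itself calls PyObject_Malloc, as shown in _PyObject_GC_Alloc above, so gc objects end up in the fine-grained allocator as well, just with a PyGC_Head in front.) Now let's look at PyObject_MALLOC: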

static void *
_PyObject_Malloc(void *ctx, size_t nbytes)
{
    void* ptr = pymalloc_alloc(ctx, nbytes);
    if (ptr != NULL) {
        _Py_AllocatedBlocks++;
        return ptr;
    }


    ptr = PyMem_RawMalloc(nbytes);
    if (ptr != NULL) {
        _Py_AllocatedBlocks++;
    }
    return ptr;
}

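The PyObject_MALLOC macro ends up calling the _PyObject_Malloc function. pymalloc_alloc is too long to show here, so in brief: if the request is larger than 512 bytes, or the fine-grained allocator fails, the raw C allocation path is used instead. This article touched on the gc module without digging into it, which leaves a loose end; next time we will study the whole gc process.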
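As a recap, the dispatch between the three paths can be written down as a small decision function. This is a rough model under stated assumptions, not CPython code: allocation_path is a hypothetical name, GC_HEAD_SIZE = 16 assumes the two pointer-sized fields of PyGC_Head on a 64-bit build, and the fallback taken when the fine-grained allocator itself fails is left out.

```python
SMALL_REQUEST_THRESHOLD = 512  # upper bound for the fine-grained allocator
GC_HEAD_SIZE = 16              # assumption: sizeof(PyGC_Head) on a 64-bit build


def allocation_path(basicsize, is_gc):
    """Return (bytes actually requested, allocator expected to serve them)."""
    # A gc type asks for room for the PyGC_Head in front of the object.
    size = basicsize + (GC_HEAD_SIZE if is_gc else 0)
    # Requests of 1..512 bytes go to the fine-grained allocator;
    # everything else falls through to the raw C allocator.
    if 0 < size <= SMALL_REQUEST_THRESHOLD:
        return size, "pymalloc"
    return size, "raw malloc"


print(allocation_path(64, is_gc=False))
print(allocation_path(500, is_gc=True))   # the header pushes it past 512
print(allocation_path(4096, is_gc=False))
```

The second call shows a subtle consequence of the header: an object that would fit the fine-grained allocator on its own can be pushed over the threshold once the PyGC_Head is added.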

Reposted from 与C同行.
