It has been a while since my last update, so here is the next installment.
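Today's topic is memory allocation. For anyone who wants to understand Python in depth, learning how Python allocates memory is an essential step. Memory is a deep subject, and only now do I feel able to give a brief introduction to it. Memory is like space, or like a container: a cup holds water, a bowl holds food, and our data and code also need somewhere to live. Loosely speaking, that somewhere is memory. In C, memory is addressed and manipulated through pointers, and since CPython is written in C, Python's memory is ultimately managed through pointers as well.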
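Now that we know CPython manages memory through pointers, we can dig further. First, let me summarize the ways Python allocates memory: one, raw C allocation; two, Python's fine-grained small-block allocation; three, Python's tracked allocation. These names are my own. Let's go through them one by one.
Raw C allocation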
static void *
_PyMem_RawMalloc(void *ctx, size_t size)
{
    /* PyMem_RawMalloc(0) means malloc(1). Some systems would return NULL
       for malloc(0), which would be treated as an error. Some platforms would
       return a pointer with no memory behind it, which would break pymalloc.
       To solve these problems, allocate an extra byte. */
    if (size == 0)
        size = 1;
    return malloc(size);
}

static void *
_PyMem_RawCalloc(void *ctx, size_t nelem, size_t elsize)
{
    /* PyMem_RawCalloc(0, 0) means calloc(1, 1). Some systems would return NULL
       for calloc(0, 0), which would be treated as an error. Some platforms
       would return a pointer with no memory behind it, which would break
       pymalloc. To solve these problems, allocate an extra byte. */
    if (nelem == 0 || elsize == 0) {
        nelem = 1;
        elsize = 1;
    }
    return calloc(nelem, elsize);
}

static void *
_PyMem_RawRealloc(void *ctx, void *ptr, size_t size)
{
    if (size == 0)
        size = 1;
    return realloc(ptr, size);
}

static void
_PyMem_RawFree(void *ctx, void *ptr)
{
    free(ptr);
}
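The zero-size rule in these wrappers can be checked from Python. The following is only a sketch, and it assumes a CPython interpreter: it uses ctypes.pythonapi to call the public C API functions PyMem_RawMalloc and PyMem_RawFree. The helper name raw_malloc_zero_is_usable is made up for this example.

```python
import ctypes

# PyMem_RawMalloc / PyMem_RawFree are public CPython C API functions;
# ctypes.pythonapi exposes them only on CPython (assumption of this sketch).
raw_malloc = ctypes.pythonapi.PyMem_RawMalloc
raw_malloc.restype = ctypes.c_void_p
raw_malloc.argtypes = [ctypes.c_size_t]

raw_free = ctypes.pythonapi.PyMem_RawFree
raw_free.restype = None
raw_free.argtypes = [ctypes.c_void_p]


def raw_malloc_zero_is_usable():
    """Return True if a zero-byte raw request yields a non-NULL pointer."""
    ptr = raw_malloc(0)      # ctypes maps a NULL result to None
    usable = ptr is not None
    if usable:
        raw_free(ptr)        # release the block we just obtained
    return usable


if __name__ == "__main__":
    print("PyMem_RawMalloc(0) returned a real block:", raw_malloc_zero_is_usable())
```

Because the wrapper bumps a zero-byte request up to one byte, the caller should get a valid pointer back rather than NULL, which is exactly what the comment in the C code is protecting.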
Python fine-grained (small-block) allocation
/* Pool for small blocks. */
struct pool_header {
    union { block *_padding;
            uint count; } ref;          /* number of allocated blocks    */
    block *freeblock;                   /* pool's free list head         */
    struct pool_header *nextpool;       /* next pool of this size class  */
    struct pool_header *prevpool;       /* previous pool       ""        */
    uint arenaindex;                    /* index into arenas of base adr */
    uint szidx;                         /* block size class index        */
    uint nextoffset;                    /* bytes to virgin block         */
    uint maxnextoffset;                 /* largest valid nextoffset      */
};
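Here is what each member of the pool header means:
ref: the number of blocks in this pool that are currently allocated (ref.count);
freeblock: the head of the pool's free list, that is, the address of the next block available for allocation;
nextpool: the next pool of the same size class (doubly linked list);
prevpool: the previous pool of the same size class (doubly linked list);
arenaindex: the index of the arena this pool belongs to;
szidx: the size class of the pool's blocks. With 8-byte alignment, for example, 16-byte blocks are class 1 and 512-byte blocks are class 63. Each pool has exactly one szidx, so a pool only ever hands out blocks of a single size;
nextoffset: the byte offset, from the start of the pool, of the next block that has never been used; it advances each time a fresh block is carved out;
maxnextoffset: the largest valid value of nextoffset. Once nextoffset exceeds it, the pool has no untouched blocks left, and it is full as soon as its free list is empty too.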
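The way these fields cooperate can be modeled in a few lines of Python. This is a toy sketch, not CPython's actual code: the constants POOL_SIZE = 4096, POOL_OVERHEAD = 48 and ALIGNMENT = 8 are assumed illustrative values, the class ToyPool and the helper capacity are hypothetical names, and the free list is simplified to a Python list.

```python
POOL_SIZE = 4096      # assumed pool size in bytes
POOL_OVERHEAD = 48    # assumed size of struct pool_header on a 64-bit build
ALIGNMENT = 8         # assumed block alignment


def size_class_index(nbytes, alignment=ALIGNMENT):
    """Map a small request (1..512 bytes) to its size class index (szidx)."""
    return (nbytes - 1) // alignment


def block_size(szidx, alignment=ALIGNMENT):
    """Block size served by a pool of the given size class."""
    return (szidx + 1) * alignment


class ToyPool:
    """Simplified model of one pool: a single block size per pool."""

    def __init__(self, szidx):
        self.szidx = szidx
        self.size = block_size(szidx)
        self.count = 0                          # models ref.count
        self.free_list = []                     # models the freeblock chain
        self.nextoffset = POOL_OVERHEAD         # first never-used block
        self.maxnextoffset = POOL_SIZE - self.size

    def alloc(self):
        """Return the offset of a block, or None if the pool is full."""
        if self.free_list:                      # reuse a freed block first
            offset = self.free_list.pop()
        elif self.nextoffset <= self.maxnextoffset:
            offset = self.nextoffset            # carve a fresh block
            self.nextoffset += self.size
        else:
            return None
        self.count += 1
        return offset

    def free(self, offset):
        """Give a block back to the pool's free list."""
        self.free_list.append(offset)
        self.count -= 1


def capacity(szidx):
    """Count how many blocks a fresh toy pool can hand out."""
    pool = ToyPool(szidx)
    while pool.alloc() is not None:
        pass
    return pool.count


if __name__ == "__main__":
    print(size_class_index(16), size_class_index(512))
    print(capacity(1), capacity(63))
```

Under these assumed constants a pool of 16-byte blocks holds far more blocks than a pool of 512-byte blocks, which is the trade-off the one-size-per-pool design makes: no per-block size bookkeeping, at the cost of dedicating a whole pool to each size class.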
Python tracked (GC) allocation
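We all know that Python frees memory through reference counting, but that is not the whole story, because reference counting has one fatal weakness: reference cycles. For example: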
a = []; b = []; a.append(b); b.append(a)
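In this case, even after the names a and b are gone, each list is still referenced by the other, so their reference counts can never reach 0 and reference counting alone never frees them. To solve this, Python introduced the gc module, which uses a mark-and-sweep style algorithm. I will skip the details and only cover the general idea. The first thing to understand is that reference cycles can only arise among containers: only objects such as lists and dicts can hold references to other objects, including other containers. So for containers Python needs a dedicated allocation path, which I call GC allocation: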
static PyObject *
_PyObject_GC_Alloc(int use_calloc, size_t basicsize)
{
    struct _gc_runtime_state *state = &_PyRuntime.gc;
    PyObject *op;
    PyGC_Head *g;
    size_t size;
    if (basicsize > PY_SSIZE_T_MAX - sizeof(PyGC_Head))
        return PyErr_NoMemory();
    size = sizeof(PyGC_Head) + basicsize;
    if (use_calloc)
        g = (PyGC_Head *)PyObject_Calloc(1, size);
    else
        g = (PyGC_Head *)PyObject_Malloc(size);
    if (g == NULL)
        return PyErr_NoMemory();
    assert(((uintptr_t)g & 3) == 0);  // g must be aligned 4bytes boundary
    g->_gc_next = 0;
    g->_gc_prev = 0;
    state->generations[0].count++; /* number of allocated GC objects */
    if (state->generations[0].count > state->generations[0].threshold &&
        state->enabled &&
        state->generations[0].threshold &&
        !state->collecting &&
        !PyErr_Occurred()) {
        state->collecting = 1;
        collect_generations(state);
        state->collecting = 0;
    }
    op = FROM_GC(g);
    return op;
}
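Both halves of this story can be observed from Python: a cycle that reference counting alone cannot free, and the generation-0 counter that _PyObject_GC_Alloc increments. The sketch below assumes a standard CPython build; the class Node and both function names are made up for illustration, and the exact counter values can differ between interpreter versions.

```python
import gc
import weakref


class Node:
    """A small container-like object whose instances can point at each other."""

    def __init__(self):
        self.other = None


def cycle_survives_refcounting():
    """Return (alive before gc.collect(), alive after gc.collect())."""
    was_enabled = gc.isenabled()
    gc.disable()                       # keep automatic collections out of the way
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a        # build the reference cycle
        probe = weakref.ref(a)         # lets us check whether a still exists
        del a, b                       # drop our own references
        alive_without_gc = probe() is not None
        gc.collect()                   # run the cycle collector explicitly
        alive_after_gc = probe() is not None
        return alive_without_gc, alive_after_gc
    finally:
        if was_enabled:
            gc.enable()


def generation0_growth(n=1000):
    """Allocate n GC-managed instances and report how far the counter moved."""
    was_enabled = gc.isenabled()
    gc.disable()                       # counting continues, collecting does not
    try:
        before = gc.get_count()[0]
        keep = [Node() for _ in range(n)]   # held alive so nothing is freed
        after = gc.get_count()[0]
        return after - before
    finally:
        if was_enabled:
            gc.enable()


if __name__ == "__main__":
    print(cycle_survives_refcounting())
    print(generation0_growth())
```

The first function shows the cycle staying alive until the collector runs; the second shows the generation-0 count growing by roughly the number of instances created, which is the quantity compared against the threshold in the C code.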
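As _PyObject_GC_Alloc shows, a GC allocation requests extra room for a PyGC_Head in front of the object. PyGC_Head is a doubly linked list node, and it is what links the objects allocated this way together so the collector can walk them: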
/* GC information is stored BEFORE the object structure. */
typedef struct {
    // Pointer to next object in the list.
    // 0 means the object is not tracked
    uintptr_t _gc_next;

    // Pointer to previous object in the list.
    // Lowest two bits are used for flags documented later.
    uintptr_t _gc_prev;
} PyGC_Head;
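The extra header is visible from Python as well. As a rough sketch, assuming CPython: sys.getsizeof adds the GC header for GC-managed objects, while the object's own __sizeof__ method does not, so their difference approximates the header size. The helper name gc_header_bytes is hypothetical, and the exact number depends on the interpreter version and build (two pointer-sized fields would be 16 bytes on a typical 64-bit build, but that is not guaranteed).

```python
import gc
import sys


def gc_header_bytes(obj):
    """Bytes that sys.getsizeof reports on top of obj.__sizeof__()."""
    return sys.getsizeof(obj) - obj.__sizeof__()


if __name__ == "__main__":
    for sample in ([], {}, 42, "text"):
        print(
            type(sample).__name__,
            "tracked by gc:", gc.is_tracked(sample),
            "header bytes:", gc_header_bytes(sample),
        )
```

Containers such as lists and dicts are expected to report a header and to be tracked by the collector, while atomic objects such as ints and strings carry no header, since they can never take part in a cycle.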
At this point we can put the three allocation modes together. Let's look at how an object gets its memory:
PyObject *
PyType_GenericAlloc(PyTypeObject *type, Py_ssize_t nitems)
{
    PyObject *obj;
    const size_t size = _PyObject_VAR_SIZE(type, nitems+1);
    /* note that we need to add one, for the sentinel */

    if (PyType_IS_GC(type))
        obj = _PyObject_GC_Malloc(size);
    else
        obj = (PyObject *)PyObject_MALLOC(size);

    if (obj == NULL)
        return PyErr_NoMemory();

    memset(obj, '\0', size);

    if (type->tp_itemsize == 0)
        (void)PyObject_INIT(obj, type);
    else
        (void) PyObject_INIT_VAR((PyVarObject *)obj, type, nitems);

    if (PyType_IS_GC(type))
        _PyObject_GC_TRACK(obj);
    return obj;
}
static void *
_PyObject_Malloc(void *ctx, size_t nbytes)
{
    void* ptr = pymalloc_alloc(ctx, nbytes);
    if (ptr != NULL) {
        _Py_AllocatedBlocks++;
        return ptr;
    }

    ptr = PyMem_RawMalloc(nbytes);
    if (ptr != NULL) {
        _Py_AllocatedBlocks++;
    }
    return ptr;
}
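Read together, the two functions form a decision chain: compute the object size, prepend a GC header for container types, then serve the request from a small-block pool if it is small enough and from raw malloc otherwise. The following Python model is a sketch of that chain, not CPython code. It assumes a 512-byte small-request threshold, a 16-byte PyGC_Head and 8-byte pointers, and the function names are made up for illustration.

```python
SMALL_REQUEST_THRESHOLD = 512   # assumed small-block limit in bytes
GC_HEAD_SIZE = 16               # assumed: two pointer-sized fields, 64-bit build
POINTER_SIZE = 8                # assumed pointer size in bytes


def round_up(n, multiple):
    """Round n up to the next multiple (multiple must be a power of two)."""
    return (n + multiple - 1) & ~(multiple - 1)


def var_size(basicsize, itemsize, nitems):
    """Model of the size computation: fixed part plus items, pointer-aligned."""
    return round_up(basicsize + nitems * itemsize, POINTER_SIZE)


def allocation_path(is_gc_type, basicsize, itemsize=0, nitems=0):
    """Return (bytes requested, backend) for one object allocation."""
    size = var_size(basicsize, itemsize, nitems + 1)   # +1 for the sentinel
    request = size + GC_HEAD_SIZE if is_gc_type else size
    if 0 < request <= SMALL_REQUEST_THRESHOLD:
        backend = "pymalloc pool"      # fine-grained small-block allocation
    else:
        backend = "raw malloc"         # fallback to the raw C allocator
    return request, backend


if __name__ == "__main__":
    print(allocation_path(False, 16))           # small non-container object
    print(allocation_path(True, 24))            # small container with GC header
    print(allocation_path(True, 40, 8, 100))    # large variable-size container
```

In this model a small object lands in a pool whether or not it carries a GC header, while a large variable-size container falls through to the raw allocator, which is how the three allocation modes end up layered on top of one another.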
This article is reposted from 与C同行.




