TutorialsPoint Unix/Linux System Call Reference Guide

Original post by yBmZlQzJ, 2023-08-09

Source: 易百教程

Unix/Linux System Calls

List of Unix, Linux System Calls
accept access acct add_key adjtimex afs_syscall alarm alloc_hugepages arch_prctl bdflush bind break brk cacheflush chdir chmod chown chroot clone2 clone close connect create_module creat dup2 dup epoll_create epoll_ctl epoll_wait execve exit_group _exit exit _Exit faccessat fattch fchdir fchmodat fchmod fchownat fchown fcntl fdatasync fdetach flock fork free_hugepages fstatat fstatfs fstat fstatvfs fsync ftruncate futex futimesat getcontext getcwd getdents getdomainname getdtablesize getegid geteuid getgid getgroups gethostid gethostname getitimer get_kernel_syms get_mempolicy getmsg getpagesize getpeername getpgid getpgrp getpid getpmsg getppid getpriority getresgid getresuid getrlimit get_robust_list getrusage getsid getsockname getsockopt get_thread_area gettid gettimeofday getuid getunwind gtty idle inb inb_p init_module inl inl_p inotify_add_watch inotify_init inotify_rm_watch insb insl insw intro inw inw_p io_cancel ioctl ioctl_list io_destroy io_getevents ioperm iopl	ioprio_get ioprio_set io_setup io_submit ipc isastream kexec_load keyctl kill killpg lchown linkat link listen _llseek llseek lock lookup_dcookie lseek lstat madvise mincore mkdirat mkdir mknodat mknod mlockall mlock mmap2 mmap modify_ldt mount move_pages mprotect mpx mq_getsetattr mremap msgctl msgget msgop msgrcv msgsnd msync multiplexer munlockall munlock munmap nanosleep _newselect nfsservctl nice obsolete oldfstat oldlstat oldolduname oldstat olduname openat open outb outb_p outl outl_p outsb outsl outsw outw outw_p path_resolution pause perfmonctl personality pipe pivot_root poll posix_fadvise ppoll prctl pread prof pselect ptrace putmsg putpmsg pwrite query_module quotactl readahead readdir read readlinkat readlink readv reboot recvfrom recv recvmsg remap_file_pages renameat rename request_key restart_syscall rmdir rtas rt_sigaction rt_sigpending rt_sigprocmask rt_sigqueueinfo rt_sigreturn rt_sigsuspend rt_sigtimedwait sbrk sched_getaffinity sched_getparam sched_get_priority_max 
sched_get_priority_min sched_getscheduler sched_rr_get_interval sched_setaffinity sched_setparam sched_setscheduler sched_yield	security select select_tut semctl semget semop semtimedop sendfile send sendmsg sendto setcontext setdomainname setegid seteuid setfsgid setfsuid setgid setgroups sethostid sethostname setitimer setpgid setpgrp setpriority setregid setresgid setresuid setreuid setrlimit set_robust_list setsid setsockopt set_thread_area set_tid_address settimeofday setuid setup sgetmask shmat shmctl shmdt shmget shmop shutdown sigaction sigaltstack signal sigpending sigprocmask sigqueue sigreturn sigsuspend sigtimedwait sigwaitinfo socketcall socket socketpair splice spu_create spufs spu_run ssetmask statfs64 statfs stat statvfs stime stty swapcontext swapoff swapon symlinkat symlink sync_file_range sync _syscall syscall syscalls _sysctl sysctl sysfs sysinfo syslog tee tgkill time timer_create timer_delete timer_getoverrun timer_gettime timer_settime times tkill truncate tux umask umount2 umount uname undocumented unimplemented unlinkat unlink unshare uselib ustat utime utimes vfork vhangup vm86 vm86old vmsplice vserver wait3 wait4 wait waitid waitpid write writev

accept() function

Name

accept - accept a connection on a socket

Synopsis

#include <sys/types.h>

#include <sys/socket.h>

int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen);

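Description

The accept() system call is used with connection-based socket types (SOCK_STREAM, SOCK_SEQPACKET). It extracts the first connection request on the queue of pending connections, creates a new connected socket, and returns a new file descriptor referring to that socket. The newly created socket is not in the listening state. The original socket sockfd is unaffected by this call.

The argument sockfd is a socket that has been created with socket(2), bound to a local address with bind(2), and is listening for connections after a listen(2).

The argument addr is a pointer to a sockaddr structure. This structure is filled in with the address of the peer socket, as known to the communications layer. The exact format of the address returned in addr is determined by the socket's address family (see socket(2) and the respective protocol man pages).

The addrlen argument is a value-result argument: it should initially contain the size of the structure pointed to by addr; on return it will contain the actual length (in bytes) of the address returned. When addr is NULL, nothing is filled in.

If no pending connections are present on the queue, and the socket is not marked as non-blocking, accept() blocks the caller until a connection is present. If the socket is marked non-blocking and no pending connections are present on the queue, accept() fails with the error EAGAIN.

In order to be notified of incoming connections on a socket, you can use select(2) or poll(2). A readable event will be delivered when a new connection is attempted, and you may then call accept() to get a socket for that connection. Alternatively, you can set the socket to deliver SIGIO when activity occurs on a socket; see socket(7) for details.

For certain protocols which require an explicit confirmation, such as DECnet, accept() can be thought of as merely dequeuing the next connection request and not implying confirmation. Confirmation can be implied by a normal read or write on the new file descriptor, and rejection can be implied by closing the new socket. Currently only DECnet has these semantics on Linux.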

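Notes

There may not always be a connection waiting after a SIGIO is delivered or select(2) or poll(2) return a readability event, because the connection might have been removed by an asynchronous network error or another thread before accept() is called. If this happens then the call will block waiting for the next connection to arrive.

To ensure that accept() never blocks, the passed socket sockfd needs to have the O_NONBLOCK flag set (see socket(7)).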
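
The non-blocking behaviour can be sketched as follows. This is only an illustration and not part of the manual page: the helper name nonblocking_accept_errno and the socket path under /tmp are assumptions made for the example, and a UNIX domain socket is used simply so that the sketch needs no network configuration.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: call accept() on an idle, non-blocking listener.
 * Returns the errno left by accept(), 0 if accept() unexpectedly
 * succeeded, or -1 if the listener could not be set up. */
int nonblocking_accept_errno(void)
{
    struct sockaddr_un addr;
    int lfd, fd, flags, err;

    lfd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (lfd == -1)
        return -1;

    /* Illustrative socket path; any unused path would do. */
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    snprintf(addr.sun_path, sizeof(addr.sun_path),
             "/tmp/accept_demo_%ld.sock", (long) getpid());
    unlink(addr.sun_path);

    flags = fcntl(lfd, F_GETFL, 0);
    if (flags == -1 ||
        fcntl(lfd, F_SETFL, flags | O_NONBLOCK) == -1 ||
        bind(lfd, (struct sockaddr *) &addr, sizeof(addr)) == -1 ||
        listen(lfd, 5) == -1) {
        close(lfd);
        unlink(addr.sun_path);
        return -1;
    }

    /* Nobody has connected, so accept() returns at once instead of blocking. */
    fd = accept(lfd, NULL, NULL);
    err = (fd == -1) ? errno : 0;

    if (fd != -1)
        close(fd);
    close(lfd);
    unlink(addr.sun_path);
    return err;
}
```

A caller would normally treat EAGAIN (or EWOULDBLOCK) as "nothing to accept yet" and go back to waiting in select(2) or poll(2).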

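Return Value

On success, accept() returns a non-negative integer that is a descriptor for the accepted socket. On error, -1 is returned, and errno is set appropriately.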

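Error Handling

Linux accept() passes already-pending network errors on the new socket as an error code from accept(). This behaviour differs from other BSD socket implementations. For reliable operation the application should detect the network errors defined for the protocol after accept() and treat them like EAGAIN by retrying. In the case of TCP/IP these are ENETDOWN, EPROTO, ENOPROTOOPT, EHOSTDOWN, ENONET, EHOSTUNREACH, EOPNOTSUPP, and ENETUNREACH.

Errors

accept() shall fail if:

Error Code	Description
EAGAIN or EWOULDBLOCK	The socket is marked non-blocking and no connections are present to be accepted.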
EBADF	The descriptor is invalid.
ECONNABORTED	A connection has been aborted.
EINTR	The system call was interrupted by a signal that was caught before a valid connection arrived.
EINVAL	Socket is not listening for connections, or addrlen is invalid (e.g., is negative).
EMFILE	The per-process limit of open file descriptors has been reached.
ENFILE	The system limit on the total number of open files has been reached.
ENOTSOCK	The descriptor references a file, not a socket.
EOPNOTSUPP	The referenced socket is not of type SOCK_STREAM.

accept() may fail if:

Error Code	Description
EFAULT	The addr argument is not in a writable part of the user address space.
ENOBUFS, ENOMEM	Not enough free memory. This often means that the memory allocation is limited by the socket buffer limits, not by the system memory.
EPROTO	Protocol error.

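Linux accept() may fail if:

Error Code	Description
EPERM	Firewall rules forbid connection.

In addition, network errors for the new socket and as defined for the protocol may be returned. Various Linux kernels can return other errors such as ENOSR, ESOCKTNOSUPPORT, EPROTONOSUPPORT, ETIMEDOUT. The value ERESTARTSYS may be seen during a trace.

Conforming To

SVr4, 4.4BSD (accept() first appeared in 4.2BSD).

Notes

The third argument of accept() was originally declared as an 'int *' (and is that under libc4 and libc5 and on many other systems like 4.x BSD, SunOS 4, SGI); a POSIX.1g draft standard wanted to change it into a 'size_t *', and that is what it is on SunOS 5. Later POSIX drafts have 'socklen_t *', and so do the Single Unix Specification and glibc2.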

access() function

Name

access - check user's permissions for a file

Synopsis

#include <unistd.h>

int access(const char *pathname, int mode);

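Description

access() checks whether the process would be allowed to read, write or test for existence of the file (or other file system object) whose name is pathname. If pathname is a symbolic link, the permissions of the file referred to by this symbolic link are tested.

mode is a mask consisting of one or more of R_OK, W_OK, X_OK and F_OK.

R_OK, W_OK and X_OK request checking whether the file exists and has read, write and execute permissions, respectively. F_OK just requests checking for the existence of the file.

The tests depend on the permissions of the directories occurring in the path to the file, as given in pathname, and on the permissions of directories and files referred to by symbolic links encountered on the way.

The check is done with the process's real UID and GID, rather than with the effective IDs as is done when actually attempting an operation. This is to allow set-user-ID programs to easily determine the invoking user's authority.

Only access bits are checked, not the file type or contents. Therefore, if a directory is found to be "writable," it probably means that files can be created in the directory, and not that the directory can be written as a file. Similarly, a DOS file may be found to be "executable," but the execve(2) call will still fail.

If the process has appropriate privileges, an implementation may indicate success for X_OK even if none of the execute file permission bits are set.

Return Value

On success (all requested permissions granted), zero is returned. On error (at least one bit in mode asked for a permission that is denied, or some other error occurred), -1 is returned, and errno is set appropriately.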
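
The mode masks and the return convention can be illustrated with a small sketch. The helper name allowed_modes is an assumption introduced here for illustration; it only wraps the access() calls described above.

```c
#include <unistd.h>

/* Hypothetical helper: report what the calling process's real UID/GID
 * may do with path. Returns a bit mask built from R_OK, W_OK and X_OK,
 * or -1 if the existence check (F_OK) fails. */
int allowed_modes(const char *path)
{
    int result = 0;

    if (access(path, F_OK) == -1)
        return -1;              /* errno says why, e.g. ENOENT */
    if (access(path, R_OK) == 0)
        result |= R_OK;
    if (access(path, W_OK) == 0)
        result |= W_OK;
    if (access(path, X_OK) == 0)
        result |= X_OK;
    return result;
}
```

For example, allowed_modes("/") & R_OK is non-zero when the root directory is readable by the caller.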

Errors

access() shall fail if:

Error Code	Description
EACCES	The requested access would be denied to the file or search permission is denied for one of the directories in the path prefix of pathname. (See also path_resolution(2).)
ELOOP	Too many symbolic links were encountered in resolving pathname.
ENAMETOOLONG	pathname is too long.
ENOENT	A directory component in pathname would have been accessible but does not exist or was a dangling symbolic link.
ENOTDIR	A component used as a directory in pathname is not, in fact, a directory.
EROFS	Write permission was requested for a file on a read-only filesystem.

access() may fail if:

Error Code	Description
EFAULT	pathname points outside your accessible address space.
EINVAL	mode was incorrectly specified.
EIO	An I/O error occurred.
ENOMEM	Insufficient kernel memory was available.
ETXTBSY	Write access was requested to an executable which is being executed.

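Restrictions

access() returns an error if any of the access types in the requested call fails, even if other types might be successful.

access() may not work correctly on NFS file systems with UID mapping enabled, because UID mapping is done on the server and hidden from the client, which checks permissions.

Using access() to check whether a user is authorized to, for example, open a file before actually doing so with open(2) creates a security hole, because the user might exploit the short time interval between checking and opening the file to manipulate it.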
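
One common way to avoid this race is to skip the separate check and let open(2) itself report the failure. The following is a minimal sketch under that assumption; the helper name open_checked is hypothetical. Note that open(2) checks the effective IDs, so a set-user-ID program would also have to adjust its effective UID before calling it.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: open path read-only with no prior access() call.
 * Returns a file descriptor, or -1 with errno set (EACCES, ENOENT, ...). */
int open_checked(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd == -1) {
        /* The kernel did the permission check as part of open(), so
         * there is no window between checking and using the file. */
        int saved = errno;
        fprintf(stderr, "cannot open %s: %s\n", path, strerror(saved));
        errno = saved;          /* keep errno intact for the caller */
    }
    return fd;
}
```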

Conforming To

SVr4, POSIX.1-2001, 4.3BSD

acct() function

Name

acct - switch process accounting on or off

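Synopsis

#include <unistd.h>

int acct(const char *filename);

Description

When called with the name of an existing file as its argument, accounting is turned on, and records for each terminating process are appended to filename as it terminates. An argument of NULL causes accounting to be turned off.

Return Value

On success, zero is returned. On error, -1 is returned, and errno is set appropriately.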

Errors

Error Code	Description
EACCES	Write permission is denied for the specified file, or search permission is denied for one of the directories in the path prefix of filename (see also path_resolution(2)), or filename is not a regular file.
EFAULT	filename points outside your accessible address space.
EIO	Error writing to the file filename.
EISDIR	filename is a directory.
ELOOP	Too many symbolic links were encountered in resolving filename.
ENAMETOOLONG	filename was too long.
ENFILE	The system limit on the total number of open files has been reached.
ENOENT	The specified filename does not exist.
ENOMEM	Out of memory.
ENOSYS	BSD process accounting has not been enabled when the operating system kernel was compiled. The kernel configuration parameter controlling this feature is CONFIG_BSD_PROCESS_ACCT.
ENOTDIR	A component used as a directory in filename is not in fact a directory.
EPERM	The calling process has insufficient privilege to enable process accounting. On Linux the CAP_SYS_PACCT capability is required.
EROFS	filename refers to a file on a read-only file system.
EUSERS	There are no more free file structures or we ran out of memory.

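Conforming To

SVr4, 4.3BSD (but not POSIX).

Notes

No accounting is produced for programs running when a crash occurs. In particular, non-terminating processes are never accounted for.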

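add_key() function

Name

add_key - add a key to the kernel's key management facility

Synopsis

#include <keyutils.h>

key_serial_t add_key(const char *type, const char *description, const void *payload, size_t plen, key_serial_t keyring);

Description

add_key() asks the kernel to create or update a key of the given type and description, instantiate it with the payload of length plen, attach it to the nominated keyring, and return its serial number.

The key type may reject the data if it is in the wrong format or is in some other way invalid.

If the destination keyring already contains a key that matches the specified type and description then, if the key type supports it, that key will be updated rather than a new key being created; if not, a new key will be created and it will displace the link to the extant key from the keyring.

The destination keyring serial number may be that of a valid keyring to which the caller has write permission, or it may be a special keyring ID:

Tag	Description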
KEY_SPEC_THREAD_KEYRING	This specifies the caller’s thread-specific keyring.
KEY_SPEC_PROCESS_KEYRING	This specifies the caller’s process-specific keyring.
KEY_SPEC_SESSION_KEYRING	This specifies the caller’s session-specific keyring.
KEY_SPEC_USER_KEYRING	This specifies the caller’s UID-specific keyring.
KEY_SPEC_USER_SESSION_KEYRING	This specifies the caller’s UID-session keyring.

Key Types

There are a number of key types available in the core key management code, and these can be specified to this function:

Tag	Description
“user”	Keys of the user-defined key type may contain a blob of arbitrary data, and the description may be any valid string, though it is preferred that the description be prefixed with a string representing the service to which the key is of interest and a colon (for instance “afs:mykey”). The payload may be empty or NULL for keys of this type.
“keyring”	Keyrings are special key types that may contain links to sequences of other keys of any type. If this interface is used to create a keyring, then a NULL payload should be specified, and plen should be zero.

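Return Value

On success add_key() returns the serial number of the key it created or updated. On error, the value -1 will be returned and errno will have been set to an appropriate error.

Errors

Error Code	Description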
ENOKEY	The keyring doesn’t exist.
EKEYEXPIRED	The keyring has expired.
EKEYREVOKED	The keyring has been revoked.
EINVAL	The payload data was invalid.
ENOMEM	Insufficient memory to create a key.
EDQUOT	The key quota for this user would be exceeded by creating this key or linking it to the keyring.
EACCES	The keyring wasn’t available for modification by the user.

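Linking

Although this is a Linux system call, it is not present in libc but can be found rather in libkeyutils. When linking, -lkeyutils should be specified to the linker.

adjtimex() function

Name

adjtimex - tune kernel clock

Synopsis

#include <sys/timex.h>

int adjtimex(struct timex *buf);

Description

Linux uses David L. Mills' clock adjustment algorithm (see RFC 1305). The system call adjtimex() reads and optionally sets adjustment parameters for this algorithm. It takes a pointer to a timex structure, updates kernel parameters from field values, and returns the same structure with current kernel values. This structure is declared as follows: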

struct timex {
    int modes;           /* mode selector */
    long offset;         /* time offset (usec) */
    long freq;           /* frequency offset (scaled ppm) */
    long maxerror;       /* maximum error (usec) */
    long esterror;       /* estimated error (usec) */
    int status;          /* clock command/status */
    long constant;       /* pll time constant */
    long precision;      /* clock precision (usec) (read only) */
    long tolerance;      /* clock frequency tolerance (ppm) (read only) */
    struct timeval time; /* current time (read only) */
    long tick;           /* usecs between clock ticks */
};

The modes field determines which parameters, if any, to set. It may contain a bitwise-or combination of zero or more of the following bits:

#define ADJ_OFFSET 0x0001 /* time offset */
#define ADJ_FREQUENCY 0x0002 /* frequency offset */
#define ADJ_MAXERROR 0x0004 /* maximum time error */
#define ADJ_ESTERROR 0x0008 /* estimated time error */
#define ADJ_STATUS 0x0010 /* clock status */
#define ADJ_TIMECONST 0x0020 /* pll time constant */
#define ADJ_TICK 0x4000 /* tick value */
#define ADJ_OFFSET_SINGLESHOT 0x8001 /* old-fashioned adjtime() */

Ordinary users are restricted to a zero value for modes. Only the superuser may set any parameters.
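
A read-only query with a zero modes value might look like the sketch below. The helper name query_clock is an assumption for illustration; some sandboxed environments may still refuse the call, in which case -1 is returned with errno set.

```c
#include <errno.h>      /* errno values for callers to inspect */
#include <string.h>
#include <sys/timex.h>

/* Hypothetical helper: read the kernel clock parameters without
 * changing them. Returns the clock state (TIME_OK, ...) or -1. */
int query_clock(struct timex *tx)
{
    memset(tx, 0, sizeof(*tx));
    tx->modes = 0;              /* zero mode: nothing is set */
    return adjtimex(tx);
}
```

After a successful call, fields such as tx->tick and tx->freq hold the current kernel values.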

Return Value

On success, adjtimex() returns the clock state:

#define TIME_OK 0 /* clock synchronized */
#define TIME_INS 1 /* insert leap second */
#define TIME_DEL 2 /* delete leap second */
#define TIME_OOP 3 /* leap second in progress */
#define TIME_WAIT 4 /* leap second has occurred */
#define TIME_BAD 5 /* clock not synchronized */

On failure, adjtimex() returns -1 and sets errno.

Errors

Error Code	Description
EFAULT	buf does not point to writable memory.
EINVAL	An attempt is made to set buf.offset to a value outside the range -131071 to +131071, or to set buf.status to a value other than those listed above, or to set buf.tick to a value outside the range 900000/HZ to 1100000/HZ, where HZ is the system timer interrupt frequency.
EPERM	buf.modes is non-zero and the caller does not have sufficient privilege. Under Linux the CAP_SYS_TIME capability is required.

Conforming To

adjtimex() is Linux specific and should not be used in programs intended to be portable. See adjtime(3) for a more portable, but less flexible, method of adjusting the system clock.

See Also

settimeofday(2)

afs_syscall() function

Name

The following system calls were not implemented on Unix/Linux at the time this page was written:

afs_syscall,
break,
fattach,
fdetach,
ftime,
getmsg,
getpmsg,
gtty,
isastream,
lock,
mpx,
multiplexer,
prof,
profil,
putmsg,
putpmsg,
security,
stty,
ulimit,
vserver

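Synopsis

Unimplemented system calls.

Description

These system calls are not implemented in the Linux 2.4 kernel.

Return Value

These system calls always return -1 and set errno to ENOSYS.

Notes

Note that ftime(3), profil(3) and ulimit(3) are implemented as library functions.

Some system calls, like alloc_hugepages(2), free_hugepages(2), ioperm(2), iopl(2), and vm86(2) only exist on certain architectures.

Some system calls, like ipc(2), create_module(2), init_module(2), and delete_module(2) only exist when the Linux kernel was built with support for them.

See Also

obsolete(2)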

alarm() function

Name

alarm - set an alarm clock for delivery of a signal

Synopsis

#include <unistd.h>

unsigned int alarm(unsigned int seconds);

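Description

alarm() arranges for a SIGALRM signal to be delivered to the process in seconds seconds.

If seconds is zero, no new alarm() is scheduled.

In any event any previously set alarm() is cancelled.

Return Value

alarm() returns the number of seconds remaining until any previously scheduled alarm was due to be delivered, or zero if there was no previously scheduled alarm.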

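Notes

alarm() and setitimer() share the same timer; calls to one will interfere with use of the other.

sleep() may be implemented using SIGALRM; mixing calls to alarm() and sleep() is a bad idea.

Scheduling delays can, as ever, cause the execution of the process to be delayed by an arbitrary amount of time.

Every process in the system has a private alarm clock. It works much like a timer that can be set to go off after a given number of seconds. When the time is up, the clock sends the signal SIGALRM to the process.

Function prototype: unsigned int alarm(unsigned int seconds);
Header file: #include <unistd.h>
Description: alarm() arranges for the signal SIGALRM to be delivered to the current process after the number of seconds given by seconds. If seconds is 0, the previously set alarm is cancelled and the time that was left on it is returned.
Return value: if the process had already set an alarm before this call, the time remaining on that alarm is returned; otherwise 0 is returned. alarm() never fails.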

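Example 1:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    unsigned int timeleft;

    printf("Set the alarm and sleep\n");
    alarm(10);
    sleep(5);

    /* Cancel the alarm; the return value is the time that was left: 5 seconds. */
    timeleft = alarm(0);
    printf("Time left before cancel, and rearm: %u\n", timeleft);

    alarm(timeleft);    /* rearm with the remaining time */

    printf("Hanging around, waiting to die\n");
    pause();            /* suspend the process until a signal arrives */

    return EXIT_SUCCESS;
}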

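Output:

First it prints: Set the alarm and sleep

After 5 seconds it prints: Time left before cancel, and rearm: 5

Hanging around, waiting to die

After another 5 seconds the program ends.

Unless the process has installed a handler for SIGALRM, the signal kills the process. To see the difference, run the next example with and without the signal(SIGALRM, timer); statement.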

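Example 2:

#include <signal.h>
#include <stdio.h>
#include <unistd.h>

/* printf() is not async-signal-safe; it is used here only for the demo. */
static void timer(int sig)
{
    static int count = 0;

    count++;
    printf("\ncount = %d\n", count);
    if (sig == SIGALRM) {
        printf("timer\n");
    }

    signal(SIGALRM, timer);     /* reinstall the handler */
    alarm(1);                   /* schedule the next tick */

    if (count == 5)
        alarm(0);               /* cancel the timer after five ticks */
}

int main(int argc, char *argv[])
{
    signal(SIGALRM, timer);
    alarm(1);
    while (1)
        pause();                /* wait for signals instead of spinning */
}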

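Another use of the timer is to schedule an action for some point in the future while doing something else in the meantime. Scheduling the action is simple: call alarm() to set the timer, then carry on with other work. When the timer counts down to zero, the signal is sent and the handler is called.

Conforming To

SVr4, POSIX.1-2001, 4.3BSD

alloc_hugepages() function

Name

alloc_hugepages, free_hugepages - allocate or free huge pages

Synopsis

void *alloc_hugepages(int key, void *addr, size_t len, int prot, int flag);

int free_hugepages(void *addr);

Description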

The system calls alloc_hugepages() and free_hugepages() were introduced in Linux 2.5.36 and removed again in 2.5.54. They existed only on i386 and ia64 (when built with CONFIG_HUGETLB_PAGE). In Linux 2.4.20 the syscall numbers exist, but the calls return ENOSYS.

On i386 the memory management hardware knows about ordinary pages (4 KiB) and huge pages (2 or 4 MiB). Similarly ia64 knows about huge pages of several sizes. These system calls serve to map huge pages into the process’ memory or to free them again. Huge pages are locked into memory, and are not swapped.

The key parameter is an identifier. When zero the pages are private, and not inherited by children. When positive the pages are shared with other applications using the same key, and inherited by child processes.

The addr parameter of free_hugepages() tells which page is being freed: it was the return value of a call to alloc_hugepages(). (The memory is first actually freed when all users have released it.) The addr parameter of alloc_hugepages() is a hint, that the kernel may or may not follow. Addresses must be properly aligned.

The len parameter is the length of the required segment. It must be a multiple of the huge page size.

The prot parameter specifies the memory protection of the segment. It is one of PROT_READ, PROT_WRITE, PROT_EXEC.

The flag parameter is ignored, unless key is positive. In that case, if flag is IPC_CREAT, then a new huge page segment is created when none with the given key existed. If this flag is not set, then ENOENT is returned when no segment with the given key exists.

Return Value

On success, alloc_hugepages() returns the allocated virtual address, and free_hugepages() returns zero. On error, -1 is returned, and errno is set appropriately.

Errors

Error Code	Description
ENOSYS	The system call is not supported on this kernel.

Conforming To

These calls existed only in Linux 2.5.36 through to 2.5.54. These calls are specific to Linux on Intel processors, and should not be used in programs intended to be portable. Indeed, the system call numbers are marked for reuse, so programs using these may do something random on a future kernel.

Files

/proc/sys/vm/nr_hugepages Number of configured hugetlb pages. This can be read and written.

/proc/meminfo Gives info on the number of configured hugetlb pages and on their size in the three variables HugePages_Total, HugePages_Free, Hugepagesize.
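
Reading one of these values from /proc/meminfo could be sketched as follows. The helper name meminfo_field is hypothetical, and the sketch assumes the usual "Name: value" line layout of that file.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one numeric field, such as
 * "HugePages_Total", from /proc/meminfo.
 * Returns the value, or -1 if the file or the field is unavailable. */
long meminfo_field(const char *name)
{
    char line[256];
    char key[64];
    long value;
    long result = -1;
    FILE *fp = fopen("/proc/meminfo", "r");

    if (fp == NULL)
        return -1;
    while (fgets(line, sizeof(line), fp) != NULL) {
        /* Each line looks like "HugePages_Total:       0". */
        if (sscanf(line, "%63[^:]: %ld", key, &value) == 2 &&
            strcmp(key, name) == 0) {
            result = value;
            break;
        }
    }
    fclose(fp);
    return result;
}
```

Calling meminfo_field("Hugepagesize") would return the huge page size in the unit printed by the kernel (kB on current systems).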

Notes

The system calls are gone. Now the hugetlbfs filesystem can be used instead. Memory backed by huge pages (if the CPU supports them) is obtained by using mmap() to map files in this virtual filesystem.

The maximal number of huge pages can be specified using the hugepages= boot parameter.

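arch_prctl() function

Name

arch_prctl - set architecture-specific thread state

Synopsis

#include <asm/prctl.h>

#include <sys/prctl.h>

int arch_prctl(int code, unsigned long addr);

Description

The arch_prctl() function sets architecture-specific process or thread state. code selects a subfunction and passes the argument addr to it.

Subfunctions for x86-64 are: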

Tag	Description
ARCH_SET_FS	Set the 64bit base for the FS register to addr.
ARCH_GET_FS	Return the 64bit base value for the FS register of the current thread in the unsigned long pointed to by the address parameter.
ARCH_SET_GS	Set the 64bit base for the GS register to addr.
ARCH_GET_GS	Return the 64bit base value for the GS register of the current thread in the unsigned long pointed to by the address parameter.
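
As a rough sketch, ARCH_GET_FS can be exercised through syscall(2). The helper name get_fs_base is an assumption, as is the choice to go through syscall(2) rather than a library wrapper; on builds other than Linux/x86-64 the helper simply returns -1.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

#if defined(__x86_64__) && defined(__linux__)
#include <asm/prctl.h>      /* ARCH_GET_FS */
#endif

/* Hypothetical helper: store the FS base of the calling thread in *base.
 * Returns 0 on success, -1 on error or on non-x86-64 builds. */
int get_fs_base(unsigned long *base)
{
#if defined(__x86_64__) && defined(__linux__)
    return (int) syscall(SYS_arch_prctl, ARCH_GET_FS, base);
#else
    (void) base;            /* the call does not exist on this platform */
    return -1;
#endif
}
```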

Errors

Error Code	Description
EFAULT	addr points to an unmapped address or is outside the process address space.
EINVAL	code is not a valid subcommand.
EPERM	addr is outside the process address space.

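Author

Man page written by Andi Kleen.

Conforming To

arch_prctl() is a Linux/x86-64 extension and should not be used in programs intended to be portable.

bdflush() function

Name

bdflush - start, flush, or tune buffer-dirty-flush daemon

Synopsis

int bdflush(int func, long *address);
int bdflush(int func, long data);

Description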

bdflush() starts, flushes, or tunes the buffer-dirty-flush daemon. Only a privileged process (one with the CAP_SYS_ADMIN capability) may call bdflush().

If func is negative or 0, and no daemon has been started, then bdflush() enters the daemon code and never returns.

If func is 1, some dirty buffers are written to disk.

If func is 2 or more and is even (low bit is 0), then address is the address of a long word, and the tuning parameter numbered (func-2)/2 is returned to the caller in that address.

If func is 3 or more and is odd (low bit is 1), then data is a long word, and the kernel sets tuning parameter numbered (func-3)/2 to that value.

The set of parameters, their values, and their legal ranges are defined in the kernel source file fs/buffer.c.

Return Value

If func is negative or 0 and the daemon successfully starts, bdflush() never returns. Otherwise, the return value is 0 on success and -1 on failure, with errno set to indicate the error.

Errors

Error Code	Description
EBUSY	An attempt was made to enter the daemon code after another process has already entered.
EFAULT	address points outside your accessible address space.
EINVAL	An attempt was made to read or write an invalid parameter number, or to write an invalid value to a parameter.
EPERM	Caller does not have the CAP_SYS_ADMIN capability.

Conforming To

bdflush() is Linux specific and should not be used in programs intended to be portable.

bind() function

Name

bind - bind a name to a socket

Synopsis

#include <sys/types.h>

#include <sys/socket.h>

int bind(int sockfd, const struct sockaddr *my_addr, socklen_t addrlen);

Description

bind() gives the socket sockfd the local address my_addr. my_addr is addrlen bytes long. Traditionally, this is called "assigning a name to a socket." When a socket is created with socket(2), it exists in a name space (address family) but has no name assigned.

It is normally necessary to assign a local address using bind() before a SOCK_STREAM socket may receive connections (see accept(2)).

The rules used in name binding vary between address families. Consult the manual entries in Section 7 for detailed information. For AF_INET see ip(7), for AF_INET6 see ipv6(7), for AF_UNIX see unix(7), for AF_APPLETALK see ddp(7), for AF_PACKET see packet(7), for AF_X25 see x25(7) and for AF_NETLINK see netlink(7).

The actual structure passed for the my_addr argument will depend on the address family. The sockaddr structure is defined as something like:

struct sockaddr {
    sa_family_t sa_family;
    char        sa_data[14];
};

The only purpose of this structure is to cast the structure pointer passed in my_addr in order to avoid compiler warnings. The following example shows how this is done when binding a socket in the Unix (AF_UNIX) domain:

#include <sys/socket.h>
#include <sys/un.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

#define MY_SOCK_PATH "/somepath"

int
main(int argc, char *argv[])
{
    int sfd;
    struct sockaddr_un addr;

    sfd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (sfd == -1) {
        perror("socket");
        exit(EXIT_FAILURE);
    }

    memset(&addr, 0, sizeof(struct sockaddr_un));   /* Clear structure */
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, MY_SOCK_PATH,
            sizeof(addr.sun_path) - 1);

    if (bind(sfd, (struct sockaddr *) &addr,
             sizeof(struct sockaddr_un)) == -1) {
        perror("bind");
        exit(EXIT_FAILURE);
    }
    ...
}

Return Value

On success, zero is returned. On error, -1 is returned, and errno is set appropriately.

Errors

Error Code	Description
EACCES	The address is protected, and the user is not the superuser.
EADDRINUSE	The given address is already in use.
EBADF	sockfd is not a valid descriptor.
EINVAL	The socket is already bound to an address.
ENOTSOCK	sockfd is a descriptor for a file, not a socket.
The following errors are specific to UNIX domain (AF_UNIX) sockets:
EACCES	Search permission is denied on a component of the path prefix. (See also path_resolution(2).)
EADDRNOTAVAIL	A non-existent interface was requested or the requested address was not local.
EFAULT	my_addr points outside the user’s accessible address space.
EINVAL	The addrlen is wrong, or the socket was not in the AF_UNIX family.
ELOOP	Too many symbolic links were encountered in resolving my_addr.
ENAMETOOLONG	my_addr is too long.
ENOENT	The file does not exist.
ENOMEM	Insufficient kernel memory was available.
ENOTDIR	A component of the path prefix is not a directory.
EROFS	The socket inode would reside on a read-only file system.

Bugs

The transparent proxy options are not described.

Conforming To

SVr4, 4.4BSD (the bind() function first appeared in 4.2BSD).

Notes

The third argument of bind() is in reality an int (and this is what 4.x BSD and libc4 and libc5 have). Some POSIX confusion resulted in the present socklen_t, also used by glibc. See also accept(2).

break (unimplemented)

Name

afs_syscall, break, fattach, fdetach, ftime, getmsg, getpmsg, gtty, isastream, lock, mpx, multiplexer, prof, profil, putmsg, putpmsg, security, stty, ulimit, vserver - unimplemented system calls

Synopsis

Unimplemented system calls.

Description

These system calls are not implemented in the Linux 2.4 kernel.

Return Value

These system calls always return -1 and set errno to ENOSYS.

Notes

Note that ftime(3), profil(3) and ulimit(3) are implemented as library functions.

Some system calls, like alloc_hugepages(2), free_hugepages(2), ioperm(2), iopl(2), and vm86(2) only exist on certain architectures.

Some system calls, like ipc(2), create_module(2), init_module(2), and delete_module(2) only exist when the Linux kernel was built with support for them.

See Also

obsolete(2)

brk() function

Name

brk, sbrk - change data segment size

Synopsis

#include <unistd.h>

int brk(void *end_data_segment);
void *sbrk(intptr_t increment);

Description

brk() sets the end of the data segment to the value specified by end_data_segment, when that value is reasonable, the system does have enough memory and the process does not exceed its max data size (see setrlimit(2)).

sbrk() increments the program’s data space by increment bytes. sbrk() isn’t a system call, it is just a C library wrapper. Calling sbrk() with an increment of 0 can be used to find the current location of the program break.
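
The effect of sbrk() on the program break can be observed with a sketch like the one below. The helper name measure_break_growth is an assumption for illustration, and the sketch assumes nothing else (such as malloc) moves the break between the calls.

```c
#define _DEFAULT_SOURCE
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: grow the data segment by increment bytes, measure
 * how far the program break moved, then shrink it back again.
 * Returns the distance moved in bytes, or -1 on error. */
intptr_t measure_break_growth(intptr_t increment)
{
    char *before = sbrk(0);             /* current program break */
    char *after;

    if (before == (void *) -1)
        return -1;
    if (sbrk(increment) == (void *) -1) /* returns the old break on success */
        return -1;
    after = sbrk(0);
    sbrk(-increment);                   /* give the memory back */
    return (intptr_t) (after - before);
}
```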

Return Value

On success, brk() returns zero, and sbrk() returns a pointer to the start of the new area. On error, -1 is returned, and errno is set to ENOMEM.

Conforming To

4.3BSD; SUSv1, marked LEGACY in SUSv2, removed in POSIX.1-2001.

brk() and sbrk() are not defined in the C Standard and are deliberately excluded from the POSIX.1 standard (see paragraphs B.1.1.1.3 and B.8.3.3).

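Notes

Various systems use various types for the parameter of sbrk(). Common are int, ssize_t, ptrdiff_t, intptr_t.

cacheflush() function

Name

cacheflush - flush contents of instruction and/or data cache

Synopsis

#include <asm/cachectl.h>

int cacheflush(char *addr, int nbytes, int cache);

Description

cacheflush() flushes the contents of the indicated cache(s) for user addresses in the range addr to (addr+nbytes-1). cache may be one of:

Tag	Description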
ICACHE	Flush the instruction cache.
DCACHE	Write back to memory and invalidate the affected valid cache lines.
BCACHE	Same as (ICACHE|DCACHE).

Return Value

cacheflush() returns 0 on success or -1 on error. If errors are detected, errno will indicate the error.

Errors

Error Code	Description
EFAULT	Some or all of the address range addr to (addr+nbytes-1) is not accessible.
EINVAL	cache parameter is not one of ICACHE, DCACHE, or BCACHE.

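Bugs

The current implementation ignores the addr and nbytes arguments. Therefore the whole cache is always flushed.

Notes

This system call is only available on MIPS based systems. It should not be used in programs intended to be portable.

chdir() function

chdir, fchdir - change working directory

Synopsis

#include <unistd.h>

int chdir(const char *path);
int fchdir(int fd);

Description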

chdir() changes the current working directory to that specified in path. fchdir() is identical to chdir(); the only difference is that the directory is given as an open file descriptor.
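
The two calls are often combined: chdir() to move somewhere by name, and fchdir() to return to a directory remembered as an open descriptor. A minimal sketch, with the hypothetical helper name visit_directory:

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: change to path, store the resulting working
 * directory in buf, then return to the starting directory via fchdir().
 * Returns 0 on success, -1 on error. */
int visit_directory(const char *path, char *buf, size_t size)
{
    int saved = open(".", O_RDONLY);    /* remember where we were */
    int rc = -1;

    if (saved == -1)
        return -1;
    if (chdir(path) == 0 && getcwd(buf, size) != NULL)
        rc = 0;
    if (fchdir(saved) == -1)            /* go back to the start */
        rc = -1;
    close(saved);
    return rc;
}
```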

Return Value

On success, zero is returned. On error, -1 is returned, and errno is set appropriately.

Errors

Depending on the file system, other errors can be returned. The more general errors for chdir() are listed below:

Error Code	Description
EACCES	Search permission is denied for one of the directories in the path prefix of path. (See also path_resolution(2).)
EFAULT	path points outside your accessible address space.
EIO	An I/O error occurred.
ELOOP	Too many symbolic links were encountered in resolving path.
ENAMETOOLONG	path is too long.
ENOENT	The file does not exist.
ENOMEM	Insufficient kernel memory was available.
ENOTDIR	A component of path is not a directory.
The general errors for fchdir() are listed below:
EACCES	Search permission was denied on the directory open on fd.
EBADF	fd is not a valid file descriptor.

Notes

A child process created via fork(2) inherits its parent’s current working directory. The current working directory is left unchanged by execve(2).

The prototype for fchdir() is only available if _BSD_SOURCE is defined, or _XOPEN_SOURCE is defined with the value 500.

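Conforming To

SVr4, 4.4BSD, POSIX.1-2001.

chmod() function

Name

chmod, fchmod - change permissions of a file

Synopsis

#include <sys/types.h>

#include <sys/stat.h>

int chmod(const char *path, mode_t mode);
int fchmod(int fildes, mode_t mode);

Description

The mode of the file given by path or referenced by fildes is changed.

Modes are specified by or'ing the following:

Tag	Description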
S_ISUID	04000 set user ID on execution
S_ISGID	02000 set group ID on execution
S_ISVTX	01000 sticky bit
S_IRUSR	00400 read by owner
S_IWUSR	00200 write by owner
S_IXUSR	00100 execute/search by owner
S_IRGRP	00040 read by group
S_IWGRP	00020 write by group
S_IXGRP	00010 execute/search by group
S_IROTH	00004 read by others
S_IWOTH	00002 write by others
S_IXOTH	00001 execute/search by others
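
Combining these bits and checking the result could be sketched as follows. The helper name apply_mode and the scratch file under /tmp are assumptions for illustration only.

```c
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a scratch file, apply mode with fchmod(),
 * and return the permission bits that fstat() reports afterwards,
 * or -1 on error. */
int apply_mode(mode_t mode)
{
    char path[] = "/tmp/chmod_demo_XXXXXX";
    struct stat st;
    int result = -1;
    int fd = mkstemp(path);             /* created with mode 0600 */

    if (fd == -1)
        return -1;
    if (fchmod(fd, mode) == 0 && fstat(fd, &st) == 0)
        result = (int) (st.st_mode & 07777);
    close(fd);
    unlink(path);
    return result;
}
```

For instance, apply_mode(S_IRUSR | S_IWUSR) corresponds to octal 0600, and apply_mode(S_IRUSR | S_IRGRP | S_IROTH) to 0444.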

The effective UID of the calling process must match the owner of the file, or the process must be privileged (Linux: it must have the CAP_FOWNER capability).

If the calling process is not privileged (Linux: does not have the CAP_FSETID capability), and the group of the file does not match the effective group ID of the process or one of its supplementary group IDs, the S_ISGID bit will be turned off, but this will not cause an error to be returned.

As a security measure, depending on the file system, the set-user-ID and set-group-ID execution bits may be turned off if a file is written. (On Linux this occurs if the writing process does not have the CAP_FSETID capability.) On some file systems, only the superuser can set the sticky bit, which may have a special meaning. For the sticky bit, and for set-user-ID and set-group-ID bits on directories, see stat(2).

On NFS file systems, restricting the permissions will immediately influence already open files, because the access control is done on the server, but open files are maintained by the client. Widening the permissions may be delayed for other clients if attribute caching is enabled on them.

Return Value

On success, zero is returned. On error, -1 is returned, and errno is set appropriately.

Errors

Depending on the file system, other errors can be returned. The more general errors for chmod() are listed below:

Error Code	Description
EACCES	Search permission is denied on a component of the path prefix. (See also path_resolution(2).)
EFAULT	path points outside your accessible address space.
EIO	An I/O error occurred.
ELOOP	Too many symbolic links were encountered in resolving path.
ENAMETOOLONG	path is too long.
ENOENT	The file does not exist.
ENOMEM	Insufficient kernel memory was available.
ENOTDIR	A component of the path prefix is not a directory.
EPERM	The effective UID does not match the owner of the file, and the process is not privileged (Linux: it does not have the CAP_FOWNER capability).
EROFS	The named file resides on a read-only file system.
The general errors for fchmod() are listed below:
EBADF	The file descriptor fildes is not valid.
EIO	See above.
EPERM	See above.
EROFS	See above.

Conforming To

4.4BSD, SVr4, POSIX.1-2001.

chown() function

chown, fchown, lchown - change ownership of a file

Synopsis

#include <sys/types.h>

#include <unistd.h>

int chown(const char *path, uid_t owner, gid_t group);
int fchown(int fd, uid_t owner, gid_t group);
int lchown(const char *path, uid_t owner, gid_t group);

Description

These system calls change the owner and group of the file specified by path or by fd. Only a privileged process (Linux: one with the CAP_CHOWN capability) may change the owner of a file. The owner of a file may change the group of the file to any group of which that owner is a member. A privileged process (Linux: with CAP_CHOWN) may change the group arbitrarily.

If the owner or group is specified as -1, then that ID is not changed. When the owner or group of an executable file are changed by a non-superuser, the S_ISUID and S_ISGID mode bits are cleared. POSIX does not specify whether this also should happen when root does the chown(); the Linux behaviour depends on the kernel version. In case of a non-group-executable file (with clear S_IXGRP bit) the S_ISGID bit indicates mandatory locking, and is not cleared by a chown().
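
The "-1 means unchanged" rule can be checked with a small sketch. The helper name chown_keeps_ids and the scratch file under /tmp are assumptions for illustration.

```c
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a scratch file and call fchown() with both
 * IDs given as -1. Returns 1 if the call succeeded and owner and group
 * are unchanged, 0 if they changed, -1 on error. */
int chown_keeps_ids(void)
{
    char path[] = "/tmp/chown_demo_XXXXXX";
    struct stat before, after;
    int result = -1;
    int fd = mkstemp(path);

    if (fd == -1)
        return -1;
    if (fstat(fd, &before) == 0 &&
        fchown(fd, (uid_t) -1, (gid_t) -1) == 0 &&  /* change nothing */
        fstat(fd, &after) == 0)
        result = (before.st_uid == after.st_uid &&
                  before.st_gid == after.st_gid);
    close(fd);
    unlink(path);
    return result;
}
```

Passing -1 for only one of the two arguments changes the other ID while leaving that one alone, subject to the privilege rules above.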

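Return Value

On success, zero is returned. On error, -1 is returned, and errno is set appropriately.

Errors

Depending on the file system, other errors can be returned. The more general errors for chown() are listed below:

Error Code	Description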
EACCES	Search permission is denied on a component of the path prefix. (See also path_resolution(2).)
EFAULT	path points outside your accessible address space.
ELOOP	Too many symbolic links were encountered in resolving path.
ENAMETOOLONG	path is too long.
ENOENT	The file does not exist.
ENOMEM	Insufficient kernel memory was available.
ENOTDIR	A component of the path prefix is not a directory.
EPERM	The calling process did not have the required permissions (see above) to change owner and/or group.
EROFS	The named file resides on a read-only file system.
The general errors for fchown() are listed below:
EBADF	The descriptor is not valid.
EIO	A low-level I/O error occurred while modifying the inode.
ENOENT	See above.
EPERM	See above.
EROFS	See above.

Notes

In versions of Linux prior to 2.1.81 (and distinct from 2.1.46), chown() did not follow symbolic links. Since Linux 2.1.81, chown() does follow symbolic links, and there is a new system call lchown() that does not follow symbolic links. Since Linux 2.1.86, lchown() (which has the same semantics as the old chown()) uses the old syscall number, and chown() uses the newly introduced number.

The prototype for fchown() is only available if _BSD_SOURCE is defined.

Conforming To

4.4BSD, SVr4, POSIX.1-2001. The 4.4BSD version can only be used by the superuser (that is, ordinary users cannot give away files).

Restrictions

The chown() semantics are deliberately violated on NFS file systems which have UID mapping enabled. Additionally, the semantics of all system calls which access the file contents are violated, because chown() may cause immediate access revocation on already open files. Client side caching may lead to a delay between the time where ownership has been changed to allow access for a user and the time where the file can actually be accessed by the user on other clients.

See Also

chroot() Function

chroot - change root directory

Synopsis

#include <unistd.h>

int chroot(const char *path);

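Description

chroot() changes the root directory to that specified in path. This directory will be used for pathnames beginning with /. The root directory is inherited by all children of the current process.

Only a privileged process (Linux: one with the CAP_SYS_CHROOT capability) may call chroot(2). This call changes an ingredient in the pathname resolution process and does nothing else.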

This call does not change the current working directory, so that after the call ‘.’ can be outside the tree rooted at ‘/’. In particular, the superuser can escape from a ‘chroot jail’ by doing ‘mkdir foo; chroot foo; cd ..’.

This call does not close open file descriptors, and such file descriptors may allow access to files outside the chroot tree.
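Because chroot() leaves the working directory alone, callers usually pair it with chdir(). The following is a minimal sketch of that pairing; the helper name enter_chroot is an assumption made for illustration, and the call fails with EPERM unless the process has CAP_SYS_CHROOT.

```c
#define _DEFAULT_SOURCE   /* exposes chroot() in <unistd.h> on glibc */
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: confine the process to `new_root`.
 * chroot() alone leaves the working directory where it was, so it is
 * followed by chdir("/") to move inside the new tree.
 * Returns 0 on success, -1 with errno set (EPERM when unprivileged). */
int enter_chroot(const char *new_root)
{
    assert(new_root != NULL);

    if (chroot(new_root) == -1)
        return -1;
    if (chdir("/") == -1)
        return -1;
    return 0;
}
```

As the paragraph above notes, descriptors that were opened before the call still refer to files outside the new root, so a caller that wants real confinement would close those as well.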

Return Value

On success, zero is returned. On error, -1 is returned, and errno is set appropriately.

Errors

Depending on the file system, other errors can be returned. The more general errors are listed below:

Error Code	Description
EACCES	Search permission is denied on a component of the path prefix. (See also path_resolution(2).)
EFAULT	path points outside your accessible address space.
EIO	An I/O error occurred.
ELOOP	Too many symbolic links were encountered in resolving path.
ENAMETOOLONG	path is too long.
ENOENT	The file does not exist.
ENOMEM	Insufficient kernel memory was available.
ENOTDIR	A component of path is not a directory.
EPERM	The caller has insufficient privilege.

Conforming To

SVr4, 4.4BSD, SUSv2 (marked LEGACY). This function is not part of POSIX.1-2001.

Notes

A child process created via fork(2) inherits its parent’s root directory. The root directory is left unchanged by execve(2).

FreeBSD has a stronger jail() system call.

See Also

clone() Function

clone, __clone2 - create a child process

Synopsis

#include <sched.h>

int clone(int (*fn)(void *), void *child_stack,
          int flags, void *arg, ...
          /* pid_t *pid, struct user_desc *tls, pid_t *ctid */ );

int __clone2(int (*fn)(void *), void *child_stack_base,
             size_t stack_size, int flags, void *arg, ...
             /* pid_t *pid, struct user_desc *tls, pid_t *ctid */ );

Description

clone() creates a new process, in a manner similar to fork(2). It is actually a library function layered on top of the underlying clone() system call, hereinafter referred to as sys_clone. A description of sys_clone is given towards the end of this page.

Unlike fork(2), these calls allow the child process to share parts of its execution context with the calling process, such as the memory space, the table of file descriptors, and the table of signal handlers. (Note that on this manual page, "calling process" normally corresponds to "parent process". But see the description of CLONE_PARENT below.)

The main use of clone() is to implement threads: multiple threads of control in a program that run concurrently in a shared memory space.

When the child process is created with clone(), it executes the function application fn(arg). (This differs from fork(2), where execution continues in the child from the point of the fork(2) call.) The fn argument is a pointer to a function that is called by the child process at the beginning of its execution. The arg argument is passed to the fn function.

When the fn(arg) function application returns, the child process terminates. The integer returned by fn is the exit code for the child process. The child process may also terminate explicitly by calling exit(2) or after receiving a fatal signal.

The child_stack argument specifies the location of the stack used by the child process. Since the child and calling process may share memory, it is not possible for the child process to execute in the same stack as the calling process. The calling process must therefore set up memory space for the child stack and pass a pointer to this space to clone(). Stacks grow downwards on all processors that run Linux (except the HP PA processors), so child_stack usually points to the topmost address of the memory space set up for the child stack.

The low byte of flags contains the number of the termination signal sent to the parent when the child dies. If this signal is specified as anything other than SIGCHLD, then the parent process must specify the __WALL or __WCLONE options when waiting for the child with wait(2). If no signal is specified, then the parent process is not signaled when the child terminates.

flags may also be bitwise-or’ed with zero or more of the following constants, in order to specify what is shared between the calling process and the child process:

Flag	Description
CLONE_PARENT (since Linux 2.3.12)	If CLONE_PARENT is set, then the parent of the new child (as returned by getppid(2)) will be the same as that of the calling process. If CLONE_PARENT is not set, then (as with fork(2)) the child’s parent is the calling process. Note that it is the parent process, as returned by getppid(2), which is signaled when the child terminates, so that if CLONE_PARENT is set, then the parent of the calling process, rather than the calling process itself, will be signaled.
CLONE_FS	If CLONE_FS is set, the caller and the child processes share the same file system information. This includes the root of the file system, the current working directory, and the umask. Any call to chroot(2), chdir(2), or umask(2) performed by the calling process or the child process also affects the other process. If CLONE_FS is not set, the child process works on a copy of the file system information of the calling process at the time of the clone() call. Calls to chroot(2), chdir(2), or umask(2) performed later by one of the processes do not affect the other process.
CLONE_FILES	If CLONE_FILES is set, the calling process and the child processes share the same file descriptor table. Any file descriptor created by the calling process or by the child process is also valid in the other process. Similarly, if one of the processes closes a file descriptor, or changes its associated flags (using the fcntl(2) F_SETFD operation), the other process is also affected. If CLONE_FILES is not set, the child process inherits a copy of all file descriptors opened in the calling process at the time of clone(). (The duplicated file descriptors in the child refer to the same open file descriptions (see open(2)) as the corresponding file descriptors in the calling process.) Subsequent operations that open or close file descriptors, or change file descriptor flags, performed by either the calling process or the child process do not affect the other process.
CLONE_NEWNS (since Linux 2.4.19)	Start the child in a new namespace. Every process lives in a namespace. The namespace of a process is the data (the set of mounts) describing the file hierarchy as seen by that process. After a fork(2) or clone(2) where the CLONE_NEWNS flag is not set, the child lives in the same namespace as the parent. The system calls mount(2) and umount(2) change the namespace of the calling process, and hence affect all processes that live in the same namespace, but do not affect processes in a different namespace. After a clone(2) where the CLONE_NEWNS flag is set, the cloned child is started in a new namespace, initialized with a copy of the namespace of the parent. Only a privileged process (one having the CAP_SYS_ADMIN capability) may specify the CLONE_NEWNS flag. It is not permitted to specify both CLONE_NEWNS and CLONE_FS in the same clone() call.
CLONE_SIGHAND	If CLONE_SIGHAND is set, the calling process and the child processes share the same table of signal handlers. If the calling process or child process calls sigaction(2) to change the behavior associated with a signal, the behavior is changed in the other process as well. However, the calling process and child processes still have distinct signal masks and sets of pending signals. So, one of them may block or unblock some signals using sigprocmask(2) without affecting the other process. If CLONE_SIGHAND is not set, the child process inherits a copy of the signal handlers of the calling process at the time clone() is called. Calls to sigaction(2) performed later by one of the processes have no effect on the other process. Since Linux 2.6.0-test6, flags must also include CLONE_VM if CLONE_SIGHAND is specified.
CLONE_PTRACE	If CLONE_PTRACE is specified, and the calling process is being traced, then trace the child also (see ptrace(2)).
CLONE_UNTRACED (since Linux 2.5.46)	If CLONE_UNTRACED is specified, then a tracing process cannot force CLONE_PTRACE on this child process.
CLONE_STOPPED (since Linux 2.6.0-test2)	If CLONE_STOPPED is set, then the child is initially stopped (as though it was sent a SIGSTOP signal), and must be resumed by sending it a SIGCONT signal.
CLONE_VFORK	If CLONE_VFORK is set, the execution of the calling process is suspended until the child releases its virtual memory resources via a call to execve(2) or _exit(2) (as with vfork(2)). If CLONE_VFORK is not set then both the calling process and the child are schedulable after the call, and an application should not rely on execution occurring in any particular order.
CLONE_VM	If CLONE_VM is set, the calling process and the child processes run in the same memory space. In particular, memory writes performed by the calling process or by the child process are also visible in the other process. Moreover, any memory mapping or unmapping performed with mmap(2) or munmap(2) by the child or calling process also affects the other process. If CLONE_VM is not set, the child process runs in a separate copy of the memory space of the calling process at the time of clone(). Memory writes or file mappings/unmappings performed by one of the processes do not affect the other, as with fork(2).
CLONE_PID (obsolete)	If CLONE_PID is set, the child process is created with the same process ID as the calling process. This is good for hacking the system, but otherwise of not much use. Since 2.3.21 this flag can be specified only by the system boot process (PID 0). It disappeared in Linux 2.5.16.
CLONE_THREAD (since Linux 2.4.0-test8)	If CLONE_THREAD is set, the child is placed in the same thread group as the calling process. To make the remainder of the discussion of CLONE_THREAD more readable, the term "thread" is used to refer to the processes within a thread group. Thread groups were a feature added in Linux 2.4 to support the POSIX threads notion of a set of threads that share a single PID. Internally, this shared PID is the so-called thread group identifier (TGID) for the thread group. Since Linux 2.4, calls to getpid(2) return the TGID of the caller. The threads within a group can be distinguished by their (system-wide) unique thread IDs (TID). A new thread’s TID is available as the function result returned to the caller of clone(), and a thread can obtain its own TID using gettid(2). When a call is made to clone() without specifying CLONE_THREAD, then the resulting thread is placed in a new thread group whose TGID is the same as the thread’s TID. This thread is the leader of the new thread group. A new thread created with CLONE_THREAD has the same parent process as the caller of clone() (i.e., like CLONE_PARENT), so that calls to getppid(2) return the same value for all of the threads in a thread group. When a CLONE_THREAD thread terminates, the thread that created it using clone() is not sent a SIGCHLD (or other termination) signal; nor can the status of such a thread be obtained using wait(2). (The thread is said to be detached.) After all of the threads in a thread group terminate the parent process of the thread group is sent a SIGCHLD (or other termination) signal. If any of the threads in a thread group performs an execve(2), then all threads other than the thread group leader are terminated, and the new program is executed in the thread group leader. If one of the threads in a thread group creates a child using fork(2), then any thread in the group can wait(2) for that child. Since Linux 2.5.35, flags must also include CLONE_SIGHAND if CLONE_THREAD is specified. Signals may be sent to a thread group as a whole (i.e., a TGID) using kill(2), or to a specific thread (i.e., TID) using tgkill(2). Signal dispositions and actions are process-wide: if an unhandled signal is delivered to a thread, then it will affect (terminate, stop, continue, be ignored in) all members of the thread group. Each thread has its own signal mask, as set by sigprocmask(2), but signals can be pending either: for the whole process (i.e., deliverable to any member of the thread group), when sent with kill(2); or for an individual thread, when sent with tgkill(2). A call to sigpending(2) returns a signal set that is the union of the signals pending for the whole process and the signals that are pending for the calling thread. If kill(2) is used to send a signal to a thread group, and the thread group has installed a handler for the signal, then the handler will be invoked in exactly one, arbitrarily selected member of the thread group that has not blocked the signal. If multiple threads in a group are waiting to accept the same signal using sigwaitinfo(2), the kernel will arbitrarily select one of these threads to receive a signal sent using kill(2).
CLONE_SYSVSEM (since Linux 2.5.10)	If CLONE_SYSVSEM is set, then the child and the calling process share a single list of System V semaphore undo values (see semop(2)). If this flag is not set, then the child has a separate undo list, which is initially empty.
CLONE_SETTLS (since Linux 2.5.32)	The newtls parameter is the new TLS (Thread Local Storage) descriptor. (See set_thread_area(2).)
CLONE_PARENT_SETTID (since Linux 2.5.49)	Store child thread ID at location parent_tidptr in parent and child memory. (In Linux 2.5.32-2.5.48 there was a flag CLONE_SETTID that did this.)
CLONE_CHILD_SETTID (since Linux 2.5.49)	Store child thread ID at location child_tidptr in child memory.
CLONE_CHILD_CLEARTID (since Linux 2.5.49)	Erase child thread ID at location child_tidptr in child memory when the child exits, and do a wakeup on the futex at that address. The address involved may be changed by the set_tid_address(2) system call. This is used by threading libraries.
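To make the effect of CLONE_VM concrete, here is a minimal sketch that clones a child with and without the flag and checks whether the child's write to a counter is visible in the parent. The names bump_counter, counter_after_child and CHILD_STACK_SIZE are assumptions made for this illustration; the stack size is arbitrary, and SIGCHLD is used as the termination signal so that a plain waitpid() works.

```c
#define _GNU_SOURCE       /* exposes clone() in <sched.h> on glibc */
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (1024 * 1024)   /* assumed size, ample for this demo */

/* Runs in the child: increment the counter the parent passed in. */
static int bump_counter(void *arg)
{
    volatile int *counter = arg;
    assert(counter != NULL);
    *counter += 1;
    return 0;                      /* becomes the child's exit status */
}

/* Hypothetical demo: clone a child with SIGCHLD plus `extra_flags`
 * (0 or CLONE_VM), wait for it, and return the parent's view of the
 * counter. Returns -1 on failure. */
int counter_after_child(int extra_flags)
{
    volatile int counter = 0;
    int status;

    char *stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* The stack grows downwards here, so pass the topmost address. */
    pid_t pid = clone(bump_counter, stack + CHILD_STACK_SIZE,
                      SIGCHLD | extra_flags, (void *)&counter);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    pid_t waited = waitpid(pid, &status, 0);
    free(stack);
    if (waited != pid || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    return counter;
}
```

With CLONE_VM the child writes into the parent's own memory, so the parent reads 1; without it the child increments a private copy and the parent still reads 0.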

sys_clone

The sys_clone system call corresponds more closely to fork(2) in that execution in the child continues from the point of the call. Thus, sys_clone only requires the flags and child_stack arguments, which have the same meaning as for clone(). (Note that the order of these arguments differs from clone().)

Another difference for sys_clone is that the child_stack argument may be zero, in which case copy-on-write semantics ensure that the child gets separate copies of stack pages when either process modifies the stack. In this case, for correct operation, the CLONE_VM option should not be specified.

Since Linux 2.5.49 the system call has five parameters. The two new parameters are parent_tidptr which points to the location (in parent and child memory) where the child thread ID will be written in case CLONE_PARENT_SETTID was specified, and child_tidptr which points to the location (in child memory) where the child thread ID will be written in case CLONE_CHILD_SETTID was specified.

Return Value

On success, the thread ID of the child process is returned in the caller’s thread of execution. On failure, a -1 will be returned in the caller’s context, no child process will be created, and errno will be set appropriately.

Errors

Error Code	Description
EAGAIN	Too many processes are already running.
EINVAL	CLONE_SIGHAND was specified, but CLONE_VM was not. (Since Linux 2.6.0-test6.)
EINVAL	CLONE_THREAD was specified, but CLONE_SIGHAND was not. (Since Linux 2.5.35.)
EINVAL	Both CLONE_FS and CLONE_NEWNS were specified in flags.
EINVAL	Returned by clone() when a zero value is specified for child_stack.
ENOMEM	Cannot allocate sufficient memory to allocate a task structure for the child, or to copy those parts of the caller’s context that need to be copied.
EPERM	CLONE_NEWNS was specified by a non-root process (process without CAP_SYS_ADMIN).
EPERM	CLONE_PID was specified by a process other than process 0.

VERSIONS

There is no entry for clone() in libc5. glibc2 provides clone() as described in this manual page.

Conforming To

The clone() and sys_clone calls are Linux specific and should not be used in programs intended to be portable.

Notes

In the kernel 2.4.x series, CLONE_THREAD generally does not make the parent of the new thread the same as the parent of the calling process. However, for kernel versions 2.4.7 to 2.4.18 the CLONE_THREAD flag implied the CLONE_PARENT flag (as in kernel 2.6).

For a while there was CLONE_DETACHED (introduced in 2.5.32): parent wants no child-exit signal. In 2.6.2 the need to give this together with CLONE_THREAD disappeared. This flag is still defined, but has no effect. On x86, clone() should not be called through vsyscall, but directly through int $0x80. On IA-64, a different system call is used:

int __clone2(int (*fn)(void *), void *child_stack_base,
             size_t stack_size, int flags, void *arg, ...
             /* pid_t *pid, struct user_desc *tls, pid_t *ctid */ );

The __clone2() system call operates in the same way as clone(), except that child_stack_base points to the lowest address of the child’s stack area, and stack_size specifies the size of the stack pointed to by child_stack_base.

BUGS

Versions of the GNU C library that include the NPTL threading library contain a wrapper function for getpid(2) that performs caching of PIDs. In programs linked against such libraries, calls to getpid(2) may return the same value, even when the threads were not created using CLONE_THREAD (and thus are not in the same thread group). To get the truth, it may be necessary to use code such as the following:

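#include <sys/syscall.h>   /* SYS_getpid */
#include <sys/types.h>     /* pid_t */
#include <unistd.h>        /* syscall() */

pid_t mypid;

mypid = syscall(SYS_getpid);   /* bypasses the glibc getpid() cache */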

See Also

close() Function

close - close a file descriptor

Synopsis

#include <unistd.h>

int close(int fd);

Description

close() closes a file descriptor, so that it no longer refers to any file and may be reused. Any record locks (see fcntl(2)) held on the file it was associated with, and owned by the process, are removed (regardless of the file descriptor that was used to obtain the lock).

If fd is the last copy of a particular file descriptor the resources associated with it are freed; if the descriptor was the last reference to a file which has been removed using unlink(2) the file is deleted.

Return Value

close() returns zero on success. On error, -1 is returned, and errno is set appropriately.

Errors

Error Code	Description
EBADF	fd isn’t a valid open file descriptor.
EINTR	The close() call was interrupted by a signal.
EIO	An I/O error occurred.

Conforming To

SVr4, 4.3BSD, POSIX.1-2001.

Notes

Not checking the return value of close() is a common but nevertheless serious programming error. It is quite possible that errors on a previous write(2) operation are first reported at the final close(). Not checking the return value when closing the file may lead to silent loss of data. This can especially be observed with NFS and with disk quota.

A successful close does not guarantee that the data has been successfully saved to disk, as the kernel defers writes. It is not common for a filesystem to flush the buffers when the stream is closed. If you need to be sure that the data is physically stored use fsync(2). (It will depend on the disk hardware at this point.)
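The two notes above can be sketched as one small routine: write the data, call fsync(2) to push it to stable storage, and treat a failing close() as a real error. The helper name save_file and the 0600 creation mode are assumptions chosen for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write `len` bytes to `path`, force them to stable
 * storage with fsync(), and report any error, including one from close().
 * Returns 0 on success, -1 with errno set on failure. */
int save_file(const char *path, const void *buf, size_t len)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd == -1)
        return -1;

    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n == -1) {
            if (errno == EINTR)
                continue;              /* interrupted before writing: retry */
            int saved = errno;
            close(fd);                 /* best effort; keep the write error */
            errno = saved;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }

    if (fsync(fd) == -1) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }

    /* A deferred write error may surface only here, so check it. */
    return close(fd);
}
```

The final return passes the result of close() straight to the caller, so a deferred error (for example on NFS or when a disk quota is exceeded) is not silently dropped.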

See Also

connect() Function

connect - initiate a connection on a socket

Synopsis

#include <sys/types.h>

#include <sys/socket.h>

int connect(int sockfd,
const struct sockaddr *serv_addr,
socklen_t addrlen);

Description

The connect() system call connects the socket referred to by the file descriptor sockfd to the address specified by serv_addr. The addrlen argument specifies the size of serv_addr. The format of the address in serv_addr is determined by the address space of the socket sockfd; see socket(2) for further details.

If the socket sockfd is of type SOCK_DGRAM then serv_addr is the address to which datagrams are sent by default, and the only address from which datagrams are received. If the socket is of type SOCK_STREAM or SOCK_SEQPACKET, this call attempts to make a connection to the socket that is bound to the address specified by serv_addr.

Generally, connection-based protocol sockets may successfully connect() only once; connectionless protocol sockets may use connect() multiple times to change their association. Connectionless sockets may dissolve the association by connecting to an address with the sa_family member of sockaddr set to AF_UNSPEC.

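Return Value

If the connection or binding succeeds, zero is returned. On error, -1 is returned, and errno is set appropriately.

Errors

The following are general socket errors only. There may be other domain-specific error codes.

Error Code	Description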
EACCES	For Unix domain sockets, which are identified by pathname: Write permission is denied on the socket file, or search permission is denied for one of the directories in the path prefix. (See also path_resolution(2).)
EACCES, EPERM	The user tried to connect to a broadcast address without having the socket broadcast flag enabled or the connection request failed because of a local firewall rule.
EADDRINUSE	Local address is already in use.
EAFNOSUPPORT	The passed address didn’t have the correct address family in its sa_family field.
EADDRNOTAVAIL	Non-existent interface was requested or the requested address was not local.
EALREADY	The socket is non-blocking and a previous connection attempt has not yet been completed.
EBADF	The file descriptor is not a valid index in the descriptor table.
ECONNREFUSED	No one listening on the remote address.
EFAULT	The socket structure address is outside the user’s address space.
EINPROGRESS	The socket is non-blocking and the connection cannot be completed immediately. It is possible to select(2) or poll(2) for completion by selecting the socket for writing. After select(2) indicates writability, use getsockopt(2) to read the SO_ERROR option at level SOL_SOCKET to determine whether connect() completed successfully (SO_ERROR is zero) or unsuccessfully (SO_ERROR is one of the usual error codes listed here, explaining the reason for the failure).
EINTR	The system call was interrupted by a signal that was caught.
EISCONN	The socket is already connected.
ENETUNREACH	Network is unreachable.
ENOTSOCK	The file descriptor is not associated with a socket.
ETIMEDOUT	Timeout while attempting connection. The server may be too busy to accept new connections. Note that for IP sockets the timeout may be very long when syncookies are enabled on the server.
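The EINPROGRESS entry above describes a three-step sequence: start a non-blocking connect(), wait for writability with select(2), then read SO_ERROR. A minimal sketch of that sequence follows. The function names connect_with_timeout and connect_loopback and the timeout parameter are assumptions made for illustration; the socket is left in non-blocking mode afterwards.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: non-blocking connect() bounded by `timeout_sec`.
 * Returns 0 on success, -1 with errno set on failure. */
int connect_with_timeout(int sockfd, const struct sockaddr *addr,
                         socklen_t addrlen, int timeout_sec)
{
    assert(addr != NULL);

    int flags = fcntl(sockfd, F_GETFL, 0);
    if (flags == -1 || fcntl(sockfd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;

    if (connect(sockfd, addr, addrlen) == 0)
        return 0;                       /* connected immediately */
    if (errno != EINPROGRESS)
        return -1;

    /* Step 2: wait until the socket becomes writable or the timer expires. */
    fd_set wfds;
    FD_ZERO(&wfds);
    FD_SET(sockfd, &wfds);
    struct timeval tv = { .tv_sec = timeout_sec, .tv_usec = 0 };

    int ready = select(sockfd + 1, NULL, &wfds, NULL, &tv);
    if (ready == 0) {
        errno = ETIMEDOUT;
        return -1;
    }
    if (ready == -1)
        return -1;

    /* Step 3: SO_ERROR tells whether the connection attempt succeeded. */
    int err = 0;
    socklen_t len = sizeof(err);
    if (getsockopt(sockfd, SOL_SOCKET, SO_ERROR, &err, &len) == -1)
        return -1;
    if (err != 0) {
        errno = err;
        return -1;
    }
    return 0;
}

/* Hypothetical demo: connect a TCP socket to 127.0.0.1:`port`.
 * Returns the connected descriptor, or -1 with errno set. */
int connect_loopback(uint16_t port, int timeout_sec)
{
    struct sockaddr_in sa;
    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sa.sin_port = htons(port);

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;

    if (connect_with_timeout(fd, (struct sockaddr *)&sa, sizeof(sa),
                             timeout_sec) == -1) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```

A failed attempt surfaces the underlying error code (for example ECONNREFUSED) through errno, which matches how a blocking connect() would have reported it.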

Conforming To

SVr4, 4.4BSD (the connect() function first appeared in 4.2BSD).

Notes

The third argument of connect() is in reality an int (and this is what 4.x BSD and libc4 and libc5 have). Some POSIX confusion resulted in the present socklen_t, also used by glibc. See also accept(2).

BUGS

Unconnecting a socket by calling connect() with an AF_UNSPEC address is not yet implemented.

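See Also

create_module() Function

create_module - create a loadable module entry

Synopsis

#include <linux/module.h>

caddr_t create_module(const char *name, size_t size);

Description

create_module() attempts to create a loadable module entry and reserve the kernel memory that will be needed to hold the module. This system call requires privilege.

Return Value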

On success, returns the kernel address at which the module will reside. On error -1 is returned and errno is set appropriately.

Errors

Error Code	Description
EEXIST	A module by that name already exists.
EFAULT	name is outside the program’s accessible address space.
EINVAL	The requested size is too small even for the module header information.
ENOMEM	The kernel could not allocate a contiguous block of memory large enough for the module.
EPERM	The caller was not privileged (did not have the CAP_SYS_MODULE capability).

Conforming To

create_module() is Linux specific.

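Notes

This system call is present only in Linux kernels up to and including 2.4; it was removed in Linux 2.6.

See Also

open() Function

open, creat - open and possibly create a file or device

Synopsis

#include <sys/types.h>

#include <sys/stat.h>

#include <fcntl.h>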

int open(const char *pathname, int flags);
int open(const char *pathname, int flags, mode_t mode);
int creat(const char *pathname, mode_t mode);

Description

Given a pathname for a file, open() returns a file descriptor, a small, non-negative integer for use in subsequent system calls (read(2), write(2), lseek(2), fcntl(2), etc.). The file descriptor returned by a successful call will be the lowest-numbered file descriptor not currently open for the process.

The new file descriptor is set to remain open across an execve(2) (i.e., the FD_CLOEXEC file descriptor flag described in fcntl(2) is initially disabled). The file offset is set to the beginning of the file (see lseek(2)).

A call to open() creates a new open file description, an entry in the system-wide table of open files. This entry records the file offset and the file status flags (modifiable via the fcntl() F_SETFL operation). A file descriptor is a reference to one of these entries; this reference is unaffected if pathname is subsequently removed or modified to refer to a different file. The new open file description is initially not shared with any other process, but sharing may arise via fork(2).
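The sharing that arises via fork(2) can be observed through the file offset, which lives in the open file description rather than in the descriptor. The sketch below is a hedged illustration; the function name offset_after_child_read and the scratch file it creates and removes are assumptions.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns the parent's file offset after a forked
 * child has read 3 bytes through the inherited descriptor, or -1 on error.
 * The scratch file at `path` is created and removed by this function. */
off_t offset_after_child_read(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd == -1)
        return -1;

    off_t result = -1;
    if (write(fd, "abcdef", 6) == 6 && lseek(fd, 0, SEEK_SET) == 0) {
        pid_t pid = fork();
        if (pid == 0) {
            /* Child: reading advances the offset of the shared description. */
            char buf[3];
            _exit(read(fd, buf, sizeof(buf)) == 3 ? 0 : 1);
        }
        if (pid > 0) {
            int status;
            if (waitpid(pid, &status, 0) == pid &&
                WIFEXITED(status) && WEXITSTATUS(status) == 0)
                result = lseek(fd, 0, SEEK_CUR);   /* parent's view */
        }
    }

    close(fd);
    unlink(path);
    return result;
}
```

The parent never reads from the file itself, yet it sees the offset moved by the child, because both descriptors refer to the same open file description.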

The parameter flags must include one of the following access modes: O_RDONLY, O_WRONLY, or O_RDWR. These request opening the file read-only, write-only, or read/write, respectively.

In addition, zero or more file creation flags and file status flags can be bitwise-or’d in flags. The file creation flags are O_CREAT, O_EXCL, O_NOCTTY, and O_TRUNC. The file status flags are all of the remaining flags listed below. The distinction between these two groups of flags is that the file status flags can be retrieved and (in some cases) modified using fcntl(2).

The full list of file creation flags and file status flags is as follows:

Flag	Description
O_APPEND	The file is opened in append mode. Before each write(), the file offset is positioned at the end of the file, as if with lseek(). O_APPEND may lead to corrupted files on NFS file systems if more than one process appends data to a file at once. This is because NFS does not support appending to a file, so the client kernel has to simulate it, which can’t be done without a race condition.
O_ASYNC	Enable signal-driven I/O: generate a signal (SIGIO by default, but this can be changed via fcntl(2)) when input or output becomes possible on this file descriptor. This feature is only available for terminals, pseudo-terminals, sockets, and (since Linux 2.6) pipes and FIFOs. See fcntl(2) for further details.
O_CREAT	If the file does not exist it will be created. The owner (user ID) of the file is set to the effective user ID of the process. The group ownership (group ID) is set either to the effective group ID of the process or to the group ID of the parent directory (depending on filesystem type and mount options, and the mode of the parent directory, see, e.g., the mount options bsdgroups and sysvgroups of the ext2 filesystem, as described in mount(8)).
O_DIRECT	Try to minimize cache effects of the I/O to and from this file. In general this will degrade performance, but it is useful in special situations, such as when applications do their own caching. File I/O is done directly to/from user space buffers. The I/O is synchronous, i.e., at the completion of a read(2) or write(2), data is guaranteed to have been transferred. Under Linux 2.4 transfer sizes, and the alignment of user buffer and file offset must all be multiples of the logical block size of the file system. Under Linux 2.6 alignment must fit the block size of the device. A semantically similar (but deprecated) interface for block devices is described in raw(8).
O_DIRECTORY	If pathname is not a directory, cause the open to fail. This flag is Linux-specific, and was added in kernel version 2.1.126, to avoid denial-of-service problems if opendir(3) is called on a FIFO or tape device, but should not be used outside of the implementation of opendir.
O_EXCL	When used with O_CREAT, if the file already exists it is an error and the open() will fail. In this context, a symbolic link exists, regardless of where it points to. O_EXCL is broken on NFS file systems; programs which rely on it for performing locking tasks will contain a race condition. The solution for performing atomic file locking using a lockfile is to create a unique file on the same file system (e.g., incorporating hostname and pid), use link(2) to make a link to the lockfile. If link() returns 0, the lock is successful. Otherwise, use stat(2) on the unique file to check if its link count has increased to 2, in which case the lock is also successful.
O_LARGEFILE	(LFS) Allow files whose sizes cannot be represented in an off_t (but can be represented in an off64_t) to be opened.
O_NOATIME	(Since Linux 2.6.8) Do not update the file last access time (st_atime in the inode) when the file is read(2). This flag is intended for use by indexing or backup programs, where its use can significantly reduce the amount of disk activity. This flag may not be effective on all filesystems. One example is NFS, where the server maintains the access time.
O_NOCTTY	If pathname refers to a terminal device — see tty(4) — it will not become the process’s controlling terminal even if the process does not have one.
O_NOFOLLOW	If pathname is a symbolic link, then the open fails. This is a FreeBSD extension, which was added to Linux in version 2.1.126. Symbolic links in earlier components of the pathname will still be followed.
O_NONBLOCK or O_NDELAY	When possible, the file is opened in non-blocking mode. Neither the open() nor any subsequent operations on the file descriptor which is returned will cause the calling process to wait. For the handling of FIFOs (named pipes), see also fifo(7). For a discussion of the effect of O_NONBLOCK in conjunction with mandatory file locks and with file leases, see fcntl(2).
O_SYNC	The file is opened for synchronous I/O. Any write()s on the resulting file descriptor will block the calling process until the data has been physically written to the underlying hardware. But see RESTRICTIONS below.
O_TRUNC	If the file already exists and is a regular file and the open mode allows writing (i.e., is O_RDWR or O_WRONLY) it will be truncated to length 0. If the file is a FIFO or terminal device file, the O_TRUNC flag is ignored. Otherwise the effect of O_TRUNC is unspecified.
Some of these optional flags can be altered using fcntl() after the file has been opened.

The argument mode specifies the permissions to use in case a new file is created. It is modified by the process’s umask in the usual way: the permissions of the created file are (mode & ~umask). Note that this mode only applies to future accesses of the newly created file; the open() call that creates a read-only file may well return a read/write file descriptor.

The following symbolic constants are provided for mode:
S_IRWXU	00700 user (file owner) has read, write and execute permission
S_IRUSR	00400 user has read permission
S_IWUSR	00200 user has write permission
S_IXUSR	00100 user has execute permission
S_IRWXG	00070 group has read, write and execute permission
S_IRGRP	00040 group has read permission
S_IWGRP	00020 group has write permission
S_IXGRP	00010 group has execute permission
S_IRWXO	00007 others have read, write and execute permission
S_IROTH	00004 others have read permission
S_IWOTH	00002 others have write permission
S_IXOTH	00001 others have execute permission

mode must be specified when O_CREAT is in the flags, and is ignored otherwise.

creat() is equivalent to open() with flags equal to O_CREAT|O_WRONLY|O_TRUNC.

RETURN VALUE

open() and creat() return the new file descriptor, or -1 if an error occurred (in which case, errno is set appropriately).

NOTES

Note that open() can open device special files, but creat() cannot create them; use mknod(2) instead.

On NFS file systems with UID mapping enabled, open() may return a file descriptor but e.g. read(2) requests are denied with EACCES. This is because the client performs open() by checking the permissions, but UID mapping is performed by the server upon read and write requests.

If the file is newly created, its st_atime, st_ctime, st_mtime fields (respectively, time of last access, time of last status change, and time of last modification; see stat(2)) are set to the current time, and so are the st_ctime and st_mtime fields of the parent directory. Otherwise, if the file is modified because of the O_TRUNC flag, its st_ctime and st_mtime fields are set to the current time.

ERRORS

Error Code	Description
EACCES	The requested access to the file is not allowed, or search permission is denied for one of the directories in the path prefix of pathname, or the file did not exist yet and write access to the parent directory is not allowed. (See also path_resolution(2).)
EEXIST	pathname already exists and O_CREAT and O_EXCL were used.
EFAULT	pathname points outside your accessible address space.
EISDIR	pathname refers to a directory and the access requested involved writing (that is, O_WRONLY or O_RDWR is set).
ELOOP	Too many symbolic links were encountered in resolving pathname, or O_NOFOLLOW was specified but pathname was a symbolic link.
EMFILE	The process already has the maximum number of files open.
ENAMETOOLONG	pathname was too long.
ENFILE	The system limit on the total number of open files has been reached.
ENODEV	pathname refers to a device special file and no corresponding device exists. (This is a Linux kernel bug; in this situation ENXIO must be returned.)
ENOENT	O_CREAT is not set and the named file does not exist. Or, a directory component in pathname does not exist or is a dangling symbolic link.
ENOMEM	Insufficient kernel memory was available.
ENOSPC	pathname was to be created but the device containing pathname has no room for the new file.
ENOTDIR	A component used as a directory in pathname is not, in fact, a directory, or O_DIRECTORY was specified and pathname was not a directory.
ENXIO	O_NONBLOCK | O_WRONLY is set, the named file is a FIFO and no process has the file open for reading. Or, the file is a device special file and no corresponding device exists.
EOVERFLOW	pathname refers to a regular file, too large to be opened; see O_LARGEFILE above.
EPERM	The O_NOATIME flag was specified, but the effective user ID of the caller did not match the owner of the file and the caller was not privileged (CAP_FOWNER).
EROFS	pathname refers to a file on a read-only filesystem and write access was requested.
ETXTBSY	pathname refers to an executable image which is currently being executed and write access was requested.
EWOULDBLOCK	The O_NONBLOCK flag was specified, and an incompatible lease was held on the file (see fcntl(2)).

NOTES

Under Linux, the O_NONBLOCK flag indicates that one wants to open but does not necessarily have the intention to read or write. This is typically used to open devices in order to get a file descriptor for use with ioctl(2).

CONFORMING TO

SVr4, 4.3BSD, POSIX.1-2001. The O_NOATIME, O_NOFOLLOW, and O_DIRECTORY flags are Linux-specific. One may have to define the _GNU_SOURCE macro to get their definitions.

The (undefined) effect of O_RDONLY | O_TRUNC varies among implementations. On many systems the file is actually truncated.

The O_DIRECT flag was introduced in SGI IRIX, where it has alignment restrictions similar to those of Linux 2.4. IRIX has also a fcntl(2) call to query appropriate alignments, and sizes.

FreeBSD 4.x introduced a flag of the same name, but without alignment restrictions. Support was added under Linux in kernel version 2.4.10. Older Linux kernels simply ignore this flag. One may have to define the _GNU_SOURCE macro to get its definition.

BUGS

"The thing that has always disturbed me about O_DIRECT is that the whole interface is just stupid, and was probably designed by a deranged monkey on some serious mind-controlling substances." — Linus

Currently, it is not possible to enable signal-driven I/O by specifying O_ASYNC when calling open(); use fcntl(2) to enable this flag.

RESTRICTIONS

There are many infelicities in the protocol underlying NFS, affecting amongst others O_SYNC and O_NDELAY.

POSIX provides for three different variants of synchronised I/O, corresponding to the flags O_SYNC, O_DSYNC and O_RSYNC. Currently (2.1.130) these are all synonymous under Linux.

SEE ALSO

dup2() function

dup, dup2 - duplicate a file descriptor

SYNOPSIS

#include <unistd.h>

int dup(int oldfd);
int dup2(int oldfd, int newfd);

DESCRIPTION

dup() and dup2() create a copy of the file descriptor oldfd.

After a successful return from dup() or dup2(), the old and new file descriptors may be used interchangeably. They refer to the same open file description (see open(2)) and thus share file offset and file status flags; for example, if the file offset is modified by using lseek(2) on one of the descriptors, the offset is also changed for the other.

The two descriptors do not share file descriptor flags (the close-on-exec flag). The close-on-exec flag (FD_CLOEXEC; see fcntl(2)) for the duplicate descriptor is off.

dup() uses the lowest-numbered unused descriptor for the new descriptor.

dup2() makes newfd be the copy of oldfd, closing newfd first if necessary.

RETURN VALUE

dup() and dup2() return the new descriptor, or -1 if an error occurred (in which case, errno is set appropriately).

ERRORS

Tag	Description
EBADF	oldfd isn’t an open file descriptor, or newfd is out of the allowed range for file descriptors.
EBUSY	(Linux only) This may be returned by dup2() during a race condition with open() and dup().
EINTR	The dup2() call was interrupted by a signal.
EMFILE	The process already has the maximum number of file descriptors open and tried to open a new one.

WARNINGS

The error returned by dup2() is different from that returned by fcntl(..., F_DUPFD, ...) when newfd is out of range. On some systems dup2() also sometimes returns EINVAL like F_DUPFD.

If newfd was open, any errors that would have been reported at close() time, are lost. A careful programmer will not use dup2() without closing newfd first.

CONFORMING TO

SVr4, 4.3BSD, POSIX.1-2001.

SEE ALSO

dup() function

dup, dup2 - duplicate a file descriptor

SYNOPSIS

#include <unistd.h>

int dup(int oldfd);
int dup2(int oldfd, int newfd);

DESCRIPTION

dup() and dup2() create a copy of the file descriptor oldfd.

The two descriptors do not share file descriptor flags (the close-on-exec flag). The close-on-exec flag (FD_CLOEXEC; see fcntl(2)) for the duplicate descriptor is off.

dup() uses the lowest-numbered unused descriptor for the new descriptor.

dup2() makes newfd be the copy of oldfd, closing newfd first if necessary.

RETURN VALUE

dup() and dup2() return the new descriptor, or -1 if an error occurred (in which case, errno is set appropriately).

ERRORS

Tag	Description
EBADF	oldfd isn’t an open file descriptor, or newfd is out of the allowed range for file descriptors.
EBUSY	(Linux only) This may be returned by dup2() during a race condition with open() and dup().
EINTR	The dup2() call was interrupted by a signal.
EMFILE	The process already has the maximum number of file descriptors open and tried to open a new one.

WARNINGS

The error returned by dup2() is different from that returned by fcntl(..., F_DUPFD, ...) when newfd is out of range. On some systems dup2() also sometimes returns EINVAL like F_DUPFD.

If newfd was open, any errors that would have been reported at close() time, are lost. A careful programmer will not use dup2() without closing newfd first.

CONFORMING TO

SVr4, 4.3BSD, POSIX.1-2001.

SEE ALSO

epoll_create() function

epoll_create - open an epoll file descriptor

SYNOPSIS

#include <sys/epoll.h>

int epoll_create(int size);

DESCRIPTION

Open an epoll file descriptor by requesting the kernel allocate an event backing store dimensioned for size descriptors. The size is not the maximum size of the backing store but just a hint to the kernel about how to dimension internal structures. The returned file descriptor will be used for all the subsequent calls to the epoll interface. The file descriptor returned by epoll_create(2) must be closed by using close(2).

RETURN VALUE

When successful, epoll_create(2) returns a non-negative integer identifying the descriptor. When an error occurs, epoll_create(2) returns -1 and errno is set appropriately.

ERRORS

Error Code	Description
EINVAL	size is not positive.
ENFILE	The system limit on the total number of open files has been reached.
ENOMEM	There was insufficient memory to create the kernel object.

CONFORMING TO

epoll_create(2) is a new API introduced in Linux kernel 2.5.44. The interface should be finalized by Linux kernel 2.5.66.

SEE ALSO

epoll_ctl() function

epoll_ctl - control interface for an epoll descriptor

SYNOPSIS

#include <sys/epoll.h>

int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);

DESCRIPTION

Control an epoll descriptor, epfd, by requesting that the operation op be performed on the target file descriptor, fd. The event describes the object linked to the file descriptor fd. The struct epoll_event is defined as:

typedef union epoll_data {
    void *ptr;
    int fd;
    __uint32_t u32;
    __uint64_t u64;
} epoll_data_t;

struct epoll_event {
    __uint32_t events; /* Epoll events */
    epoll_data_t data; /* User data variable */
};

The events member is a bit set composed using the following available event types:

Event	Description
EPOLLIN	The associated file is available for read(2) operations.
EPOLLOUT	The associated file is available for write(2) operations.
EPOLLRDHUP	Stream socket peer closed connection, or shut down writing half of connection. (This flag is especially useful for writing simple code to detect peer shutdown when using Edge Triggered monitoring.)
EPOLLPRI	There is urgent data available for read(2) operations.
EPOLLERR	Error condition happened on the associated file descriptor. epoll_wait(2) will always wait for this event; it is not necessary to set it in events.
EPOLLHUP	Hang up happened on the associated file descriptor. epoll_wait(2) will always wait for this event; it is not necessary to set it in events.
EPOLLET	Sets the Edge Triggered behaviour for the associated file descriptor. The default behaviour for epoll is Level Triggered. See epoll(7) for more detailed information about Edge and Level Triggered event distribution architectures.
EPOLLONESHOT	(Since kernel 2.6.2) Sets the one-shot behaviour for the associated file descriptor. This means that after an event is pulled out with epoll_wait(2) the associated file descriptor is internally disabled and no other events will be reported by the epoll interface. The user must call epoll_ctl(2) with EPOLL_CTL_MOD to re-enable the file descriptor with a new event mask.

The epoll interface supports all file descriptors that support poll(2). Valid values for the op parameter are :

Code	Description
EPOLL_CTL_ADD	Add the target file descriptor fd to the epoll descriptor epfd and associate the event event with the internal file linked to fd.
EPOLL_CTL_MOD	Change the event event associated with the target file descriptor fd.
EPOLL_CTL_DEL	Remove the target file descriptor fd from the epoll file descriptor, epfd. The event is ignored and can be NULL (but see BUGS below).

RETURN VALUE

When successful, epoll_ctl(2) returns zero. When an error occurs, epoll_ctl(2) returns -1 and errno is set appropriately.

ERRORS

Error Code	Description
EBADF	epfd or fd is not a valid file descriptor.
EEXIST	op was EPOLL_CTL_ADD, and the supplied file descriptor fd is already in epfd.
EINVAL	epfd is not an epoll file descriptor, or fd is the same as epfd, or the requested operation op is not supported by this interface.
ENOENT	op was EPOLL_CTL_MOD or EPOLL_CTL_DEL, and fd is not in epfd.
ENOMEM	There was insufficient memory to handle the requested op control operation.
EPERM	The target file fd does not support epoll.

CONFORMING TO

epoll_ctl(2) is a new API introduced in Linux kernel 2.5.44. The interface should be finalized by Linux kernel 2.5.66.

BUGS

In kernel versions before 2.6.9, the EPOLL_CTL_DEL operation required a non-NULL pointer in event, even though this argument is ignored. Since kernel 2.6.9, event can be specified as NULL when using EPOLL_CTL_DEL.

SEE ALSO

epoll_wait() function

epoll_wait - wait for an I/O event on an epoll file descriptor

SYNOPSIS

#include <sys/epoll.h>

int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout);

DESCRIPTION

Wait for events on the epoll file descriptor epfd for a maximum time of timeout milliseconds. The memory area pointed to by events will contain the events that will be available for the caller. Up to maxevents are returned by epoll_wait(2).

The maxevents parameter must be greater than zero. Specifying a timeout of -1 makes epoll_wait(2) wait indefinitely, while specifying a timeout equal to zero makes epoll_wait(2) return immediately even if no events are available (return code equal to zero).

struct epoll_event is defined as follows (see epoll_ctl(2) for the epoll_data_t union):

struct epoll_event {
    __uint32_t events; /* Epoll events */
    epoll_data_t data; /* User data variable */
};

The data of each returned structure will contain the same data the user set with an epoll_ctl(2) call (EPOLL_CTL_ADD, EPOLL_CTL_MOD), while the events member will contain the returned event bit field.

RETURN VALUE

When successful, epoll_wait(2) returns the number of file descriptors ready for the requested I/O, or zero if no file descriptor became ready during the requested timeout milliseconds. When an error occurs, epoll_wait(2) returns -1 and errno is set appropriately.

ERRORS

Tag	Description
EBADF	epfd is not a valid file descriptor.
EFAULT	The memory area pointed to by events is not accessible with write permissions.
EINTR	The call was interrupted by a signal handler before any of the requested events occurred or the timeout expired.
EINVAL	epfd is not an epoll file descriptor, or maxevents is less than or equal to zero.

CONFORMING TO

epoll_wait(2) is a new API introduced in Linux kernel 2.5.44. The interface should be finalized by Linux kernel 2.5.66.

SEE ALSO

execve() function

execve - execute program

SYNOPSIS

#include <unistd.h>

int execve(const char *filename, char *const argv[], char *const envp[]);

DESCRIPTION

execve() executes the program pointed to by filename. filename must be either a binary executable, or a script starting with a line of the form "#! interpreter [arg]". In the latter case, the interpreter must be a valid pathname for an executable which is not itself a script, which will be invoked as interpreter [arg] filename.

argv is an array of argument strings passed to the new program. envp is an array of strings, conventionally of the form key=value, which are passed as environment to the new program. Both argv and envp must be terminated by a null pointer. The argument vector and environment can be accessed by the called program’s main function, when it is defined as int main(int argc, char *argv[], char *envp[]).

execve() does not return on success, and the text, data, bss, and stack of the calling process are overwritten by that of the program loaded. The program invoked inherits the calling process’s PID, and any open file descriptors that are not set to close-on-exec. Signals pending on the calling process are cleared. Any signals set to be caught by the calling process are reset to their default behaviour. The SIGCHLD signal (when set to SIG_IGN) may or may not be reset to SIG_DFL.

If the current program is being ptraced, a SIGTRAP is sent to it after a successful execve().

If the set-user-ID bit is set on the program file pointed to by filename, and the calling process is not being ptraced, then the effective user ID of the calling process is changed to that of the owner of the program file. Similarly, when the set-group-ID bit of the program file is set, the effective group ID of the calling process is set to the group of the program file.

The effective user ID of the process is copied to the saved set-user-ID; similarly, the effective group ID is copied to the saved set-group-ID. This copying takes place after any effective ID changes that occur because of the set-user-ID and set-group-ID permission bits.

If the executable is an a.out dynamically-linked binary executable containing shared-library stubs, the Linux dynamic linker ld.so(8) is called at the start of execution to bring needed shared libraries into memory and link the executable with them.

If the executable is a dynamically-linked ELF executable, the interpreter named in the PT_INTERP segment is used to load the needed shared libraries. This interpreter is typically /lib/ld-linux.so.1 for binaries linked with the Linux libc version 5, or /lib/ld-linux.so.2 for binaries linked with the GNU libc version 2.

RETURN VALUE

On success, execve() does not return; on error, -1 is returned and errno is set appropriately.

ERRORS

Error Code	Description
E2BIG	The total number of bytes in the environment (envp) and argument list (argv) is too large.
EACCES	Search permission is denied on a component of the path prefix of filename or the name of a script interpreter. (See also path_resolution(2).)
EACCES	The file or a script interpreter is not a regular file.
EACCES	Execute permission is denied for the file or a script or ELF interpreter.
EACCES	The file system is mounted noexec.
EFAULT	filename points outside your accessible address space.
EINVAL	An ELF executable had more than one PT_INTERP segment (i.e., tried to name more than one interpreter).
EIO	An I/O error occurred.
EISDIR	An ELF interpreter was a directory.
ELIBBAD	An ELF interpreter was not in a recognised format.
ELOOP	Too many symbolic links were encountered in resolving filename or the name of a script or ELF interpreter.
EMFILE	The process has the maximum number of files open.
ENAMETOOLONG	filename is too long.
ENFILE	The system limit on the total number of open files has been reached.
ENOENT	The file filename or a script or ELF interpreter does not exist, or a shared library needed for file or interpreter cannot be found.
ENOEXEC	An executable is not in a recognised format, is for the wrong architecture, or has some other format error that means it cannot be executed.
ENOMEM	Insufficient kernel memory was available.
ENOTDIR	A component of the path prefix of filename or a script or ELF interpreter is not a directory.
EPERM	The file system is mounted nosuid, the user is not the superuser, and the file has an SUID or SGID bit set.
EPERM	The process is being traced, the user is not the superuser and the file has an SUID or SGID bit set.
ETXTBSY	Executable was open for writing by one or more processes.

CONFORMING TO

SVr4, 4.3BSD, POSIX.1-2001. POSIX.1-2001 does not document the #! behavior, but is otherwise compatible.

NOTES

SUID and SGID processes can not be ptrace()d. Linux ignores the SUID and SGID bits on scripts.

The result of mounting a filesystem nosuid varies between Linux kernel versions: some will refuse execution of SUID/SGID executables when this would give the user powers she did not have already (and return EPERM), some will just ignore the SUID/SGID bits and exec() successfully.

A maximum line length of 127 characters is allowed for the first line in a #! executable shell script.

HISTORY

With Unix V6 the argument list of an exec() call was ended by 0, while the argument list of main was ended by -1. Thus, this argument list was not directly usable in a further exec() call. Since Unix V7 both are NULL.

SEE ALSO

exit_group() function

exit_group - exit all threads in a process

SYNOPSIS

#include <linux/unistd.h>

void exit_group(int status);

DESCRIPTION

This system call is equivalent to exit(2) except that it terminates not only the present thread, but all threads in the current thread group.

RETURN VALUE

This system call does not return.

HISTORY

This call is present since Linux 2.5.35.

CONFORMING TO

This call is Linux-specific.

SEE ALSO

exit(2)

_exit() function

_exit, _Exit - terminate the current process

SYNOPSIS

#include <unistd.h>

void _exit(int status);

#include <stdlib.h>

void _Exit(int status);

DESCRIPTION

The function _exit() terminates the calling process "immediately". Any open file descriptors belonging to the process are closed; any children of the process are inherited by process 1, init, and the process’s parent is sent a SIGCHLD signal.

The value status is returned to the parent process as the process’s exit status, and can be collected using one of the wait() family of calls.

The function _Exit() is equivalent to _exit().

RETURN VALUE

These functions do not return.

CONFORMING TO

SVr4, POSIX.1-2001, 4.3BSD. The function _Exit() was introduced by C99.

NOTES

For a discussion on the effects of an exit, the transmission of exit status, zombie processes, signals sent, etc., see exit(3).

The function _exit() is like exit(), but does not call any functions registered with atexit() or on_exit(). Whether it flushes standard I/O buffers and removes temporary files created with tmpfile(3) is implementation dependent. On the other hand, _exit() does close open file descriptors, and this may cause an unknown delay, waiting for pending output to finish. If the delay is undesired, it may be useful to call functions like tcflush() before calling _exit(). Whether any pending I/O is cancelled, and which pending I/O may be cancelled upon _exit(), is implementation-dependent.

SEE ALSO

exit() function

_exit, _Exit - terminate the current process

SYNOPSIS

#include <unistd.h>

void _exit(int status);

#include <stdlib.h>

void _Exit(int status);

DESCRIPTION

The value status is returned to the parent process as the process’s exit status, and can be collected using one of the wait() family of calls.

The function _Exit() is equivalent to _exit().

RETURN VALUE

These functions do not return.

CONFORMING TO

SVr4, POSIX.1-2001, 4.3BSD. The function _Exit() was introduced by C99.

NOTES

For a discussion on the effects of an exit, the transmission of exit status, zombie processes, signals sent, etc., see exit(3).

SEE ALSO

faccessat() function

faccessat - check permissions of a file relative to a directory file descriptor

SYNOPSIS

#include <unistd.h>

int faccessat(int dirfd, const char *path, int mode, int flags);

DESCRIPTION

The faccessat() system call operates in exactly the same way as access(2), except for the differences described in this manual page.

If the pathname given in path is relative, then it is interpreted relative to the directory referred to by the file descriptor dirfd (rather than relative to the current working directory of the calling process, as is done by access(2) for a relative pathname).

If the pathname given in path is relative and dirfd is the special value AT_FDCWD, then path is interpreted relative to the current working directory of the calling process (like access(2)).

If the pathname given in path is absolute, then dirfd is ignored.

flags is constructed by ORing together zero or more of the following values:

Code	Description
AT_EACCESS	Perform access checks using the effective user and group IDs. By default, faccessat() uses the real IDs (like access(2)).
AT_SYMLINK_NOFOLLOW	If path is a symbolic link, do not dereference it: instead return information about the link itself.

RETURN VALUE

On success, faccessat() returns 0. On error, -1 is returned and errno is set to indicate the error.

ERRORS

The same errors that occur for access(2) can also occur for faccessat(). The following additional errors can occur for faccessat():

Tag	Description
EBADF	dirfd is not a valid file descriptor.
EINVAL	Invalid flag specified in flags.
ENOTDIR	path is a relative path and dirfd is a file descriptor referring to a file other than a directory.

NOTES

See openat(2) for an explanation of the need for faccessat().

CONFORMING TO

This system call is non-standard but is proposed for inclusion in a future revision of POSIX.1.

GLIBC NOTES

The AT_EACCESS and AT_SYMLINK_NOFOLLOW flags are actually implemented within the glibc wrapper function for faccessat(). If either of these flags are specified, then the wrapper function employs fstatat(2) to determine access permissions.

VERSIONS

faccessat() was added to Linux in kernel 2.6.16.

SEE ALSO

fattach() function

afs_syscall, break, fattach, fdetach, ftime, getmsg, getpmsg, gtty, isastream, lock, mpx, multiplexer, prof, profil, putmsg, putpmsg, security, stty, ulimit, vserver - unimplemented system calls

SYNOPSIS

Unimplemented system calls.

DESCRIPTION

These system calls are not implemented in the Linux 2.4 kernel.

RETURN VALUE

These system calls always return -1 and set errno to ENOSYS.

NOTES

Note that ftime(3), profil(3) and ulimit(3) are implemented as library functions.

Some system calls, like alloc_hugepages(2), free_hugepages(2), ioperm(2), iopl(2), and vm86(2) only exist on certain architectures.

Some system calls, like ipc(2), create_module(2), init_module(2), and delete_module(2) only exist when the Linux kernel was built with support for them.

SEE ALSO

obsolete(2)

fchdir() function

chdir, fchdir - change working directory

SYNOPSIS

#include <unistd.h>

int chdir(const char *path);
int fchdir(int fd);

DESCRIPTION

chdir() changes the current working directory to that specified in path. fchdir() is identical to chdir(); the only difference is that the directory is given as an open file descriptor.

RETURN VALUE

On success, zero is returned. On error, -1 is returned, and errno is set appropriately.

ERRORS

Depending on the file system, other errors can be returned. The more general errors for chdir() are listed below:

Error Code	Description
EACCES	Search permission is denied for one of the directories in the path prefix of path. (See also path_resolution(2).)
EFAULT	path points outside your accessible address space.
EIO	An I/O error occurred.
ELOOP	Too many symbolic links were encountered in resolving path.
ENAMETOOLONG	path is too long.
ENOENT	The file does not exist.
ENOMEM	Insufficient kernel memory was available.
ENOTDIR	A component of path is not a directory.
The general errors for fchdir() are listed below:
EACCES	Search permission was denied on the directory open on fd.
EBADF	fd is not a valid file descriptor.

NOTES

A child process created via fork(2) inherits its parent’s current working directory. The current working directory is left unchanged by execve(2).

The prototype for fchdir() is only available if _BSD_SOURCE is defined, or _XOPEN_SOURCE is defined with the value 500.

CONFORMING TO

SVr4, 4.4BSD, POSIX.1-2001.

SEE ALSO

fchmodat() function

fchmodat - change permissions of a file relative to a directory file descriptor

SYNOPSIS

#include <sys/stat.h>

int fchmodat(int dirfd, const char *path, mode_t mode, int flags);

DESCRIPTION

The fchmodat() system call operates in exactly the same way as chmod(2), except for the differences described in this manual page.

If the pathname given in path is relative and dirfd is the special value AT_FDCWD, then path is interpreted relative to the current working directory of the calling process (like chmod(2)).

If the pathname given in path is absolute, then dirfd is ignored.

flags can either be 0, or include the following flag:

Tag	Description
AT_SYMLINK_NOFOLLOW	If path is a symbolic link, do not dereference it: instead operate on the link itself. This flag is not currently implemented.

RETURN VALUE

On success, fchmodat() returns 0. On error, -1 is returned and errno is set to indicate the error.

ERRORS

The same errors that occur for chmod(2) can also occur for fchmodat(). The following additional errors can occur for fchmodat():

Tag	Description
EBADF	dirfd is not a valid file descriptor.
EINVAL	Invalid flag specified in flags.
ENOTDIR	path is a relative path and dirfd is a file descriptor referring to a file other than a directory.
ENOTSUP	flags specified AT_SYMLINK_NOFOLLOW, which is not supported.

NOTES

See openat(2) for an explanation of the need for fchmodat().

CONFORMING TO

This system call is non-standard but is proposed for inclusion in a future revision of POSIX.1.

VERSIONS

fchmodat() was added to Linux in kernel 2.6.16.

SEE ALSO

fchmod() function

chmod, fchmod - change permissions of a file

SYNOPSIS

#include <sys/types.h>

#include <sys/stat.h>

int chmod(const char *path, mode_t mode);

int fchmod(int fildes, mode_t mode);

DESCRIPTION

The mode of the file given by path or referenced by fildes is changed.

Modes are specified by or’ing the following:

Mode	Description
S_ISUID	04000 set user ID on execution
S_ISGID	02000 set group ID on execution
S_ISVTX	01000 sticky bit
S_IRUSR	00400 read by owner
S_IWUSR	00200 write by owner
S_IXUSR	00100 execute/search by owner
S_IRGRP	00040 read by group
S_IWGRP	00020 write by group
S_IXGRP	00010 execute/search by group
S_IROTH	00004 read by others
S_IWOTH	00002 write by others
S_IXOTH	00001 execute/search by others

The effective UID of the calling process must match the owner of the file, or the process must be privileged (Linux: it must have the CAP_FOWNER capability).

RETURN VALUE

On success, zero is returned. On error, -1 is returned, and errno is set appropriately.

ERRORS

Depending on the file system, other errors can be returned. The more general errors for chmod() are listed below:

Error Code	Description
EACCES	Search permission is denied on a component of the path prefix. (See also path_resolution(2).)
EFAULT	path points outside your accessible address space.
EIO	An I/O error occurred.
ELOOP	Too many symbolic links were encountered in resolving path.
ENAMETOOLONG	path is too long.
ENOENT	The file does not exist.
ENOMEM	Insufficient kernel memory was available.
ENOTDIR	A component of the path prefix is not a directory.
EPERM	The effective UID does not match the owner of the file, and the process is not privileged (Linux: it does not have the CAP_FOWNER capability).
EROFS	The named file resides on a read-only file system.
The general errors for fchmod() are listed below:
EBADF	The file descriptor fildes is not valid.
EIO	See above.
EPERM	See above.
EROFS	See above.

CONFORMING TO

4.4BSD, SVr4, POSIX.1-2001.

SEE ALSO

fchownat() function

fchownat - change ownership of a file relative to a directory file descriptor

SYNOPSIS

#include <unistd.h>

int fchownat(int dirfd, const char *path, uid_t owner, gid_t group, int flags);

DESCRIPTION

The fchownat() system call operates in exactly the same way as chown(2), except for the differences described in this manual page.

If the pathname given in path is relative and dirfd is the special value AT_FDCWD, then path is interpreted relative to the current working directory of the calling process (like chown(2)).

If the pathname given in path is absolute, then dirfd is ignored.

flags can either be 0, or include the following flag:

Tag	Description
AT_SYMLINK_NOFOLLOW	If path is a symbolic link, do not dereference it: instead operate on the link itself, like lchown(2). (By default, fchownat() dereferences symbolic links, like chown(2).)

RETURN VALUE

On success, fchownat() returns 0. On error, -1 is returned and errno is set to indicate the error.

ERRORS

The same errors that occur for chown(2) can also occur for fchownat(). The following additional errors can occur for fchownat():

Tag	Description
EBADF	dirfd is not a valid file descriptor.
EINVAL	Invalid flag specified in flags.
ENOTDIR	path is a relative path and dirfd is a file descriptor referring to a file other than a directory.

NOTES

See openat(2) for an explanation of the need for fchownat().

CONFORMING TO

This system call is non-standard but is proposed for inclusion in a future revision of POSIX.1. A similar system call exists on Solaris.

VERSIONS

fchownat() was added to Linux in kernel 2.6.16.

SEE ALSO

fchown() function

chown, fchown, lchown - change ownership of a file

SYNOPSIS

#include <sys/types.h>

#include <unistd.h>

int chown(const char *path, uid_t owner, gid_t group);

int fchown(int fd, uid_t owner, gid_t group);

int lchown(const char *path, uid_t owner, gid_t group);

DESCRIPTION

If the owner or group is specified as -1, then that ID is not changed.

When the owner or group of an executable file are changed by a non-superuser, the S_ISUID and S_ISGID mode bits are cleared. POSIX does not specify whether this also should happen when root does the chown(); the Linux behaviour depends on the kernel version. In case of a non-group-executable file (with clear S_IXGRP bit) the S_ISGID bit indicates mandatory locking, and is not cleared by a chown().

RETURN VALUE

On success, zero is returned. On error, -1 is returned, and errno is set appropriately.

ERRORS

Depending on the file system, other errors can be returned. The more general errors for chown() are listed below.

Tag	Description
EACCES	Search permission is denied on a component of the path prefix. (See also path_resolution(2).)
EFAULT	path points outside your accessible address space.
ELOOP	Too many symbolic links were encountered in resolving path.
ENAMETOOLONG	path is too long.
ENOENT	The file does not exist.
ENOMEM	Insufficient kernel memory was available.
ENOTDIR	A component of the path prefix is not a directory.
EPERM	The calling process did not have the required permissions (see above) to change owner and/or group.
EROFS	The named file resides on a read-only file system.
The general errors for fchown() are listed below:
EBADF	The descriptor is not valid.
EIO	A low-level I/O error occurred while modifying the inode.
ENOENT	See above.
EPERM	See above.
EROFS	See above.

NOTES

The prototype for fchown() is only available if _BSD_SOURCE is defined.

CONFORMING TO

4.4BSD, SVr4, POSIX.1-2001.

The 4.4BSD version can only be used by the superuser (that is, ordinary users cannot give away files).

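RESTRICTIONS

SEE ALSO

fcntl() function

fcntl - manipulate file descriptor

SYNOPSIS

#include <unistd.h>

#include <fcntl.h>

int fcntl(int fd, int cmd);

int fcntl(int fd, int cmd, long arg);

int fcntl(int fd, int cmd, struct flock *lock);

DESCRIPTION

fcntl() performs one of the operations described below on the open file descriptor fd. The operation is determined by cmd.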

Duplicating a file descriptor

F_DUPFD

Find the lowest numbered available file descriptor greater than or equal to arg and make it be a copy of fd. This is different from dup2(2), which uses exactly the descriptor specified.

On success, the new descriptor is returned.

See dup(2) for further details.
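As a small illustration of F_DUPFD, the sketch below wraps the call in a hypothetical helper named dup_at_least(); the name is an assumption made for this example only.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Duplicates fd onto the lowest free descriptor that is >= min_fd.
 * Returns the new descriptor, or -1 with errno set. */
int dup_at_least(int fd, int min_fd)
{
    assert(fd >= 0 && min_fd >= 0);
    return fcntl(fd, F_DUPFD, min_fd);
}
```

Unlike dup2(2), the caller does not choose the exact descriptor number, only a lower bound, so a descriptor that is already open is never silently closed.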

File descriptor flags

The following commands manipulate the flags associated with a file descriptor. Currently, only one such flag is defined: FD_CLOEXEC, the close-on-exec flag. If the FD_CLOEXEC bit is 0, the file descriptor will remain open across an execve(2), otherwise it will be closed.

Tag	Description
F_GETFD	Read the file descriptor flags.
F_SETFD	Set the file descriptor flags to the value specified by arg.
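The usual read-modify-write pattern for the descriptor flags might look like the sketch below. The helper name set_cloexec() is hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Turns on the close-on-exec flag for fd while preserving any other
 * descriptor flags. Returns 0 on success, -1 with errno set on error. */
int set_cloexec(int fd)
{
    int flags;

    assert(fd >= 0);
    flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
}
```

Reading the flags first keeps the code correct should further descriptor flags ever be defined.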

File status flags

Each open file description has certain associated status flags, initialized by open(2) and possibly modified by fcntl(2). Duplicated file descriptors (made with dup(), fcntl(F_DUPFD), fork(), etc.) refer to the same open file description, and thus share the same file status flags.

The file status flags and their semantics are described in open(2).

Tag	Description
F_GETFL	Read the file status flags.
F_SETFL	Set the file status flags to the value specified by arg. File access mode (O_RDONLY, O_WRONLY, O_RDWR) and file creation flags (i.e., O_CREAT, O_EXCL, O_NOCTTY, O_TRUNC) in arg are ignored. On Linux this command can only change the O_APPEND, O_ASYNC, O_DIRECT, O_NOATIME, and O_NONBLOCK flags.
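A common use of F_GETFL and F_SETFL is switching a descriptor to non-blocking mode. The sketch below assumes a hypothetical helper name, set_nonblocking().

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Adds O_NONBLOCK to the file status flags of fd, keeping the other
 * status flags as they are. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags;

    assert(fd >= 0);
    flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}
```

Because the status flags belong to the open file description, the change is also visible through any duplicate of fd.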

Advisory locking

F_GETLK, F_SETLK and F_SETLKW are used to acquire, release, and test for the existence of record locks (also known as file-segment or file-region locks). The third argument lock is a pointer to a structure that has at least the following fields (in unspecified order).

struct flock {
    ...
    short l_type;    /* Type of lock: F_RDLCK,
                        F_WRLCK, F_UNLCK */
    short l_whence;  /* How to interpret l_start:
                        SEEK_SET, SEEK_CUR, SEEK_END */
    off_t l_start;   /* Starting offset for lock */
    off_t l_len;     /* Number of bytes to lock */
    pid_t l_pid;     /* PID of process blocking our lock
                        (F_GETLK only) */
    ...
};

The l_whence, l_start, and l_len fields of this structure specify the range of bytes we wish to lock. l_start is the starting offset for the lock, and is interpreted relative to either: the start of the file (if l_whence is SEEK_SET); the current file offset (if l_whence is SEEK_CUR); or the end of the file (if l_whence is SEEK_END). In the final two cases, l_start can be a negative number provided the offset does not lie before the start of the file. l_len is a non-negative integer (but see the NOTES below) specifying the number of bytes to be locked. Bytes past the end of the file may be locked, but not bytes before the start of the file. Specifying 0 for l_len has the special meaning: lock all bytes starting at the location specified by l_whence and l_start through to the end of file, no matter how large the file grows.

The l_type field can be used to place a read (F_RDLCK) or a write (F_WRLCK) lock on a file. Any number of processes may hold a read lock (shared lock) on a file region, but only one process may hold a write lock (exclusive lock). An exclusive lock excludes all other locks, both shared and exclusive. A single process can hold only one type of lock on a file region; if a new lock is applied to an already-locked region, then the existing lock is converted to the new lock type. (Such conversions may involve splitting, shrinking, or coalescing with an existing lock if the byte range specified by the new lock does not precisely coincide with the range of the existing lock.)

Tag	Description
F_SETLK	Acquire a lock (when l_type is F_RDLCK or F_WRLCK) or release a lock (when l_type is F_UNLCK) on the bytes specified by the l_whence, l_start, and l_len fields of lock. If a conflicting lock is held by another process, this call returns -1 and sets errno to EACCES or EAGAIN.
F_SETLKW	As for F_SETLK, but if a conflicting lock is held on the file, then wait for that lock to be released. If a signal is caught while waiting, then the call is interrupted and (after the signal handler has returned) returns immediately (with return value -1 and errno set to EINTR).
F_GETLK	On input to this call, lock describes a lock we would like to place on the file. If the lock could be placed, fcntl() does not actually place it, but returns F_UNLCK in the l_type field of lock and leaves the other fields of the structure unchanged. If one or more incompatible locks would prevent this lock being placed, then fcntl() returns details about one of these locks in the l_type, l_whence, l_start, and l_len fields of lock and sets l_pid to be the PID of the process holding that lock.

In order to place a read lock, fd must be open for reading. In order to place a write lock, fd must be open for writing. To place both types of lock, open a file read-write.

As well as being removed by an explicit F_UNLCK, record locks are automatically released when the process terminates or if it closes any file descriptor referring to a file on which locks are held. This is bad: it means that a process can lose the locks on a file like /etc/passwd or /etc/mtab when for some reason a library function decides to open, read and close it.

Record locks are not inherited by a child created via fork(2), but are preserved across an execve(2).

Because of the buffering performed by the stdio(3) library, the use of record locking with routines in that package should be avoided; use read(2) and write(2) instead.
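The record-locking commands above can be sketched as follows. The helper names whole_file_lock() and lock_holder() are hypothetical, and the example locks the entire file by using l_start = 0 with l_len = 0.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Applies `type` (F_RDLCK, F_WRLCK or F_UNLCK) to the whole file without
 * blocking. Returns 0 on success, -1 with errno set on failure. */
int whole_file_lock(int fd, short type)
{
    struct flock fl;

    assert(fd >= 0);
    memset(&fl, 0, sizeof fl);
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;           /* 0 = through end of file, however it grows */
    return fcntl(fd, F_SETLK, &fl);
}

/* Asks, via F_GETLK, whether a write lock on the whole file could be placed.
 * Returns 0 if it could, the PID of one conflicting holder if not,
 * or -1 on error. */
pid_t lock_holder(int fd)
{
    struct flock fl;

    assert(fd >= 0);
    memset(&fl, 0, sizeof fl);
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    if (fcntl(fd, F_GETLK, &fl) == -1)
        return -1;
    return fl.l_type == F_UNLCK ? 0 : fl.l_pid;
}
```

A process never conflicts with its own record locks, so lock_holder() reports 0 when the only lock on the file belongs to the caller.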

Mandatory locking

(Non-POSIX.) The above record locks may be either advisory or mandatory, and are advisory by default.

Advisory locks are not enforced and are useful only between cooperating processes.

Mandatory locks are enforced for all processes. If a process tries to perform an incompatible access (e.g., read(2) or write(2)) on a file region that has an incompatible mandatory lock, then the result depends upon whether the O_NONBLOCK flag is enabled for its open file description. If the O_NONBLOCK flag is not enabled, then the system call is blocked until the lock is removed or converted to a mode that is compatible with the access. If the O_NONBLOCK flag is enabled, then the system call fails with the error EAGAIN or EWOULDBLOCK.

To make use of mandatory locks, mandatory locking must be enabled both on the file system that contains the file to be locked, and on the file itself. Mandatory locking is enabled on a file system using the "-o mand" option to mount(8), or the MS_MANDLOCK flag for mount(2). Mandatory locking is enabled on a file by disabling group execute permission on the file and enabling the set-group-ID permission bit (see chmod(1) and chmod(2)).

Managing signals

F_GETOWN, F_SETOWN, F_GETSIG and F_SETSIG are used to manage I/O availability signals:

Tag	Description
F_GETOWN	Get the process ID or process group currently receiving SIGIO and SIGURG signals for events on file descriptor fd. Process IDs are returned as positive values; process group IDs are returned as negative values (but see BUGS below).
F_SETOWN	Set the process ID or process group ID that will receive SIGIO and SIGURG signals for events on file descriptor fd. A process ID is specified as a positive value; a process group ID is specified as a negative value. Most commonly, the calling process specifies itself as the owner (that is, arg is specified as getpid()). If you set the O_ASYNC status flag on a file descriptor (either by providing this flag with the open(2) call, or by using the F_SETFL command of fcntl()), a SIGIO signal is sent whenever input or output becomes possible on that file descriptor. F_SETSIG can be used to obtain delivery of a signal other than SIGIO. Sending a signal to the owner process (group) specified by F_SETOWN is subject to the same permissions checks as are described for kill(2), where the sending process is the one that employs F_SETOWN (but see BUGS below). If this permission check fails, then the signal is silently discarded. If the file descriptor fd refers to a socket, F_SETOWN also selects the recipient of SIGURG signals that are delivered when out-of-band data arrives on that socket. (SIGURG is sent in any situation where select(2) would report the socket as having an "exceptional condition".) If a non-zero value is given to F_SETSIG in a multi-threaded process running with a threading library that supports thread groups (e.g., NPTL), then a positive value given to F_SETOWN has a different meaning: instead of being a process ID identifying a whole process, it is a thread ID identifying a specific thread within a process. Consequently, it may be necessary to pass F_SETOWN the result of gettid() instead of getpid() to get sensible results when F_SETSIG is used. (In current Linux threading implementations, a main thread’s thread ID is the same as its process ID. This means that a single-threaded program can equally use gettid() or getpid() in this scenario.) Note, however, that the statements in this paragraph do not apply to the SIGURG signal generated for out-of-band data on a socket: this signal is always sent to either a process or a process group, depending on the value given to F_SETOWN. Note also that Linux imposes a limit on the number of real-time signals that may be queued to a process (see getrlimit(2) and signal(7)) and if this limit is reached, then the kernel reverts to delivering SIGIO, and this signal is delivered to the entire process rather than to a specific thread.
F_GETSIG	Get the signal sent when input or output becomes possible. A value of zero means SIGIO is sent. Any other value (including SIGIO) is the signal sent instead, and in this case additional info is available to the signal handler if installed with SA_SIGINFO.
F_SETSIG	Sets the signal sent when input or output becomes possible. A value of zero means to send the default SIGIO signal. Any other value (including SIGIO) is the signal to send instead, and in this case additional info is available to the signal handler if installed with SA_SIGINFO. Additionally, passing a non-zero value to F_SETSIG changes the signal recipient from a whole process to a specific thread within a process. See the description of F_SETOWN for more details. By using F_SETSIG with a non-zero value, and setting SA_SIGINFO for the signal handler (see sigaction(2)), extra information about I/O events is passed to the handler in a siginfo_t structure. If the si_code field indicates the source is SI_SIGIO, the si_fd field gives the file descriptor associated with the event. Otherwise, there is no indication which file descriptors are pending, and you should use the usual mechanisms (select(2), poll(2), read(2) with O_NONBLOCK set etc.) to determine which file descriptors are available for I/O. By selecting a real time signal (value >= SIGRTMIN), multiple I/O events may be queued using the same signal numbers. (Queuing is dependent on available memory). Extra information is available if SA_SIGINFO is set for the signal handler, as above.

Using these mechanisms, a program can implement fully asynchronous I/O without using select(2) or poll(2) most of the time.

The use of O_ASYNC, F_GETOWN, F_SETOWN is specific to BSD and Linux. F_GETSIG and F_SETSIG are Linux-specific. POSIX has asynchronous I/O and the aio_sigevent structure to achieve similar things; these are also available in Linux as part of the GNU C Library (Glibc).
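The F_SETOWN and O_ASYNC steps can be sketched on a pipe as shown below. This is an illustration only, assuming Linux with glibc; the names on_sigio(), enable_sigio() and sigio_demo() are hypothetical. The polling loop is a precaution in case the signal is not handled by the time write(2) returns.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

static volatile sig_atomic_t sigio_seen = 0;

static void on_sigio(int signo)
{
    (void)signo;
    sigio_seen = 1;
}

/* Makes the calling process the owner of fd and turns on O_ASYNC, so that
 * SIGIO is sent to it when I/O becomes possible on fd. */
int enable_sigio(int fd)
{
    int flags;

    assert(fd >= 0);
    if (fcntl(fd, F_SETOWN, getpid()) == -1)
        return -1;
    flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_ASYNC);
}

/* Returns 1 if SIGIO was observed after a byte was written into a pipe
 * whose read end has O_ASYNC set, 0 if it was not, -1 on setup failure. */
int sigio_demo(void)
{
    int p[2];
    int seen = -1;
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigio;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGIO, &sa, NULL) == -1)
        return -1;
    if (pipe(p) == -1)
        return -1;

    sigio_seen = 0;
    if (enable_sigio(p[0]) == 0 && write(p[1], "x", 1) == 1) {
        struct timespec pause_1ms = { 0, 1000000 };
        /* Wait up to roughly 100 ms for the handler to run. */
        for (int i = 0; i < 100 && !sigio_seen; i++)
            nanosleep(&pause_1ms, NULL);
        seen = sigio_seen;
    }
    /* The handler stays installed, so a SIGIO raised while the pipe is
     * being closed is harmless. */
    close(p[0]);
    close(p[1]);
    return seen;
}
```

The default action for SIGIO terminates the process, which is why the handler is installed before O_ASYNC is enabled.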

Leases

F_SETLEASE and F_GETLEASE (Linux 2.4 onwards) are used (respectively) to establish and retrieve the current setting of the calling process’s lease on the file referred to by fd. A file lease provides a mechanism whereby the process holding the lease (the "lease holder") is notified (via delivery of a signal) when a process (the "lease breaker") tries to open(2) or truncate(2) that file.

F_SETLEASE

Set or remove a file lease according to which of the following values is specified in the integer arg:

Tag	Description
F_RDLCK	Take out a read lease. This will cause the calling process to be notified when the file is opened for writing or is truncated. A read lease can only be placed on a file descriptor that is opened read-only.
F_WRLCK	Take out a write lease. This will cause the caller to be notified when the file is opened for reading or writing or is truncated. A write lease may be placed on a file only if no other process currently has the file open.
F_UNLCK	Remove our lease from the file.

A process may hold only one type of lease on a file.

Leases may only be taken out on regular files. An unprivileged process may only take out a lease on a file whose UID matches the file system UID of the process. A process with the CAP_LEASE capability may take out leases on arbitrary files.

F_GETLEASE

Indicates what type of lease we hold on the file referred to by fd by returning either F_RDLCK, F_WRLCK, or F_UNLCK, indicating, respectively, that the calling process holds a read, a write, or no lease on the file. (The third argument to fcntl() is omitted.)

When a process (the "lease breaker") performs an open() or truncate() that conflicts with a lease established via F_SETLEASE, the system call is blocked by the kernel and the kernel notifies the lease holder by sending it a signal (SIGIO by default). The lease holder should respond to receipt of this signal by doing whatever cleanup is required in preparation for the file to be accessed by another process (e.g., flushing cached buffers) and then either remove or downgrade its lease. A lease is removed by performing an F_SETLEASE command specifying arg as F_UNLCK. If we currently hold a write lease on the file, and the lease breaker is opening the file for reading, then it is sufficient to downgrade the lease to a read lease. This is done by performing an F_SETLEASE command specifying arg as F_RDLCK.

If the lease holder fails to downgrade or remove the lease within the number of seconds specified in /proc/sys/fs/lease-break-time then the kernel forcibly removes or downgrades the lease holder’s lease.

Once the lease has been voluntarily or forcibly removed or downgraded, and assuming the lease breaker has not unblocked its system call, the kernel permits the lease breaker’s system call to proceed.

If the lease breaker’s blocked open() or truncate() is interrupted by a signal handler, then the system call fails with the error EINTR, but the other steps still occur as described above. If the lease breaker is killed by a signal while blocked in open() or truncate(), then the other steps still occur as described above. If the lease breaker specifies the O_NONBLOCK flag when calling open(), then the call immediately fails with the error EWOULDBLOCK, but the other steps still occur as described above.

The default signal used to notify the lease holder is SIGIO, but this can be changed using the F_SETSIG command to fcntl(). If a F_SETSIG command is performed (even one specifying SIGIO), and the signal handler is established using SA_SIGINFO, then the handler will receive a siginfo_t structure as its second argument, and the si_fd field of this argument will hold the descriptor of the leased file that has been accessed by another process. (This is useful if the caller holds leases against multiple files).

File and directory change notification

F_NOTIFY

(Linux 2.4 onwards) Provide notification when the directory referred to by fd or any of the files that it contains is changed. The events to be notified are specified in arg, which is a bit mask specified by ORing together zero or more of the following bits:

Bit	Description (event in directory)
DN_MODIFY	A file was modified (write, pwrite, writev, truncate, ftruncate)
DN_CREATE	A file was created (open, creat, mknod, mkdir, link, symlink, rename)
DN_DELETE	A file was unlinked (unlink, rename to another directory, rmdir)
DN_RENAME	A file was renamed within this directory (rename)
DN_ATTRIB	The attributes of a file were changed (chown, chmod, utime[s])

(In order to obtain these definitions, the _GNU_SOURCE feature test macro must be defined.)

Directory notifications are normally "one-shot", and the application must re-register to receive further notifications. Alternatively, if DN_MULTISHOT is included in arg, then notification will remain in effect until explicitly removed.

A series of F_NOTIFY requests is cumulative, with the events in arg being added to the set already monitored. To disable notification of all events, make an F_NOTIFY call specifying arg as 0.

Notification occurs via delivery of a signal. The default signal is SIGIO, but this can be changed using the F_SETSIG command to fcntl(). In the latter case, the signal handler receives a siginfo_t structure as its second argument (if the handler was established using SA_SIGINFO) and the si_fd field of this structure contains the file descriptor which generated the notification (useful when establishing notification on multiple directories).

Especially when using DN_MULTISHOT, a real time signal should be used for notification, so that multiple notifications can be queued.

NOTE: New applications should consider using the inotify interface (available since kernel 2.6.13), which provides a superior interface for obtaining notifications of file system events. See inotify(7).

RETURN VALUE

For a successful call, the return value depends on the operation:

Tag	Description
F_DUPFD	The new descriptor.
F_GETFD	Value of flags.
F_GETFL	Value of flags.
F_GETOWN	Value of descriptor owner.
F_GETSIG	Value of signal sent when read or write becomes possible, or zero for traditional SIGIO behaviour.
All other commands	Zero.

On error, -1 is returned, and errno is set appropriately.

ERRORS

Tag	Description
EACCES or EAGAIN	Operation is prohibited by locks held by other processes.
EAGAIN	The operation is prohibited because the file has been memory-mapped by another process.
EBADF	fd is not an open file descriptor, or the command was F_SETLK or F_SETLKW and the file descriptor open mode doesn’t match with the type of lock requested.
EDEADLK	It was detected that the specified F_SETLKW command would cause a deadlock.
EFAULT	lock is outside your accessible address space.
EINTR	For F_SETLKW, the command was interrupted by a signal. For F_GETLK and F_SETLK, the command was interrupted by a signal before the lock was checked or acquired. Most likely when locking a remote file (e.g. locking over NFS), but can sometimes happen locally.
EINVAL	For F_DUPFD, arg is negative or is greater than the maximum allowable value. For F_SETSIG, arg is not an allowable signal number.
EMFILE	For F_DUPFD, the process already has the maximum number of file descriptors open.
ENOLCK	Too many segment locks open, lock table is full, or a remote locking protocol failed (e.g. locking over NFS).
EPERM	Attempted to clear the O_APPEND flag on a file that has the append-only attribute set.

NOTES

The errors returned by dup2() are different from those returned by F_DUPFD.

Since kernel 2.0, there is no interaction between the types of lock placed by flock(2) and fcntl(2).

POSIX.1-2001 allows l_len to be negative. (And if it is, the interval described by the lock covers bytes l_start+l_len up to and including l_start-1.) This is supported by Linux since Linux 2.4.21 and 2.5.49.

Several systems have more fields in struct flock such as e.g. l_sysid. Clearly, l_pid alone is not going to be very useful if the process holding the lock may live on a different machine.

BUGS

A limitation of the Linux system call conventions on some architectures (notably x86) means that if a (negative) process group ID to be returned by F_GETOWN falls in the range -1 to -4095, then the return value is wrongly interpreted by glibc as an error in the system call; that is, the return value of fcntl() will be -1, and errno will contain the (positive) process group ID.

In Linux 2.4 and earlier, there is a bug that can occur when an unprivileged process uses F_SETOWN to specify the owner of a socket file descriptor as a process (group) other than the caller. In this case, fcntl() can return -1 with errno set to EPERM, even when the owner process (group) is one that the caller has permission to send signals to. Despite this error return, the file descriptor owner is set, and signals will be sent to the owner.

CONFORMING TO

SVr4, 4.3BSD, POSIX.1-2001. Only the operations F_DUPFD, F_GETFD, F_SETFD, F_GETFL, F_SETFL, F_GETLK, F_SETLK, F_SETLKW, F_GETOWN, and F_SETOWN are specified in POSIX.1-2001.

F_GETSIG, F_SETSIG, F_NOTIFY, F_GETLEASE, and F_SETLEASE are Linux specific. (Define the _GNU_SOURCE macro to obtain these definitions.)

SEE ALSO

fdatasync() function

fdatasync - synchronize a file's in-core data with that on disk

SYNOPSIS

#include <unistd.h>

int fdatasync(int fd);

DESCRIPTION

fdatasync() flushes all data buffers of a file to disk (before the system call returns). It resembles fsync() but is not required to update the metadata such as access time.

Applications that access databases or log files often write a tiny data fragment (e.g., one line in a log file) and then call fsync() immediately in order to ensure that the written data is physically stored on the hard disk. Unfortunately, fsync() will always initiate two write operations: one for the newly written data and another one in order to update the modification time stored in the inode.

If the modification time is not a part of the transaction concept, fdatasync() can be used to avoid unnecessary inode disk write operations.
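The log-file pattern described above might be sketched like this. The helper names append_durably() and append_line_to() are hypothetical, and a short write is simply treated as an error to keep the example small.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes one record to fd and asks the kernel to flush the file's data
 * before returning. Returns 0 on success, -1 on error. */
int append_durably(int fd, const char *line)
{
    size_t len;

    assert(fd >= 0 && line != NULL);
    len = strlen(line);
    if (write(fd, line, len) != (ssize_t)len)
        return -1;              /* error or short write */
    return fdatasync(fd);
}

/* Opens (or creates) `path` in append mode, adds one line, and closes it. */
int append_line_to(const char *path, const char *line)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0600);
    int rc;

    if (fd < 0)
        return -1;
    rc = append_durably(fd, line);
    close(fd);
    return rc;
}
```

Replacing fdatasync() with fsync() here would additionally force out metadata such as the modification time.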

RETURN VALUE

On success, zero is returned. On error, -1 is returned, and errno is set appropriately.

ERRORS

Error Code	Description
EBADF	fd is not a valid file descriptor open for writing.
EIO	An error occurred during synchronization.
EROFS, EINVAL	fd is bound to a special file which does not support synchronization.

BUGS

Currently (Linux 2.2) fdatasync() is equivalent to fsync().

AVAILABILITY

On POSIX systems on which fdatasync() is available, _POSIX_SYNCHRONIZED_IO is defined in <unistd.h> to a value greater than 0. (See also sysconf(3).)

CONFORMING TO

POSIX.1-2001.

SEE ALSO

fdetach() function

afs_syscall, break, fattach, fdetach, ftime, getmsg, getpmsg, gtty, isastream, lock, mpx, multiplexer, prof, profil, putmsg, putpmsg, security, stty, ulimit, vserver - unimplemented system calls

SYNOPSIS

Unimplemented system calls.

DESCRIPTION

These system calls are not implemented in the Linux 2.4 kernel.

RETURN VALUE

These system calls always return -1 and set errno to ENOSYS.

NOTES

Note that ftime(3), profil(3) and ulimit(3) are implemented as library functions.

Some system calls, like alloc_hugepages(2), free_hugepages(2), ioperm(2), iopl(2), and vm86(2) only exist on certain architectures.

Some system calls, like ipc(2), create_module(2), init_module(2), and delete_module(2) only exist when the Linux kernel was built with support for them.

SEE ALSO

obsolete(2)

flock() function

flock - apply or remove an advisory lock on an open file

SYNOPSIS

#include <sys/file.h>

int flock(int fd, int operation);

DESCRIPTION

Apply or remove an advisory lock on the open file specified by fd. The argument operation is one of the following:

Tag	Description
LOCK_SH	Place a shared lock. More than one process may hold a shared lock for a given file at a given time.
LOCK_EX	Place an exclusive lock. Only one process may hold an exclusive lock for a given file at a given time.
LOCK_UN	Remove an existing lock held by this process.

A call to flock() may block if an incompatible lock is held by another process. To make a non-blocking request, include LOCK_NB (by ORing) with any of the above operations.

A single file may not simultaneously have both shared and exclusive locks.

Locks created by flock() are associated with an open file table entry. This means that duplicate file descriptors (created by, for example, fork(2) or dup(2)) refer to the same lock, and this lock may be modified or released using any of these descriptors. Furthermore, the lock is released either by an explicit LOCK_UN operation on any of these duplicate descriptors, or when all such descriptors have been closed.

If a process uses open(2) (or similar) to obtain more than one descriptor for the same file, these descriptors are treated independently by flock(). An attempt to lock the file using one of these file descriptors may be denied by a lock that the calling process has already placed via another descriptor.

A process may only hold one type of lock (shared or exclusive) on a file. Subsequent flock() calls on an already locked file will convert an existing lock to the new lock mode.

Locks created by flock() are preserved across an execve(2).

A shared or exclusive lock can be placed on a file regardless of the mode in which the file was opened.
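The point about independently opened descriptors can be illustrated with the sketch below. The function name flock_demo() and the path passed to it are hypothetical; the second descriptor requests the lock with LOCK_NB so that the refusal shows up as an error instead of a block.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Opens `path` twice, takes an exclusive lock through the first descriptor,
 * and then tries a non-blocking exclusive lock through the second.
 * Returns 1 if the second request is refused with EWOULDBLOCK, else 0. */
int flock_demo(const char *path)
{
    int refused = 0;
    int a, b;

    assert(path != NULL);
    a = open(path, O_RDWR | O_CREAT, 0600);
    b = open(path, O_RDWR);
    if (a >= 0 && b >= 0 && flock(a, LOCK_EX) == 0) {
        if (flock(b, LOCK_EX | LOCK_NB) == -1 && errno == EWOULDBLOCK)
            refused = 1;
        flock(a, LOCK_UN);
    }
    if (a >= 0)
        close(a);
    if (b >= 0)
        close(b);
    unlink(path);
    return refused;
}
```

Had the second descriptor been produced by dup(2) instead of a second open(2), both would refer to the same lock and the request would not be refused.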

RETURN VALUE

On success, zero is returned. On error, -1 is returned, and errno is set appropriately.

ERRORS

Error Code	Description
EBADF	fd is not an open file descriptor.
EINTR	While waiting to acquire a lock, the call was interrupted by delivery of a signal caught by a handler.
EINVAL	operation is invalid.
ENOLCK	The kernel ran out of memory for allocating lock records.
EWOULDBLOCK	The file is locked and the LOCK_NB flag was selected.

CONFORMING TO

4.4BSD (the flock(2) call first appeared in 4.2BSD). A version of flock(2), possibly implemented in terms of fcntl(2), appears on most Unices.

NOTES

flock(2) does not lock files over NFS. Use fcntl(2) instead: that does work over NFS, given a sufficiently recent version of Linux and a server which supports locking.

Since kernel 2.0, flock(2) is implemented as a system call in its own right rather than being emulated in the GNU C library as a call to fcntl(2). This yields true BSD semantics: there is no interaction between the types of lock placed by flock(2) and fcntl(2), and flock(2) does not detect deadlock.

flock(2) places advisory locks only; given suitable permissions on a file, a process is free to ignore the use of flock(2) and perform I/O on the file.

flock(2) and fcntl(2) locks have different semantics with respect to forked processes and dup(2). On systems that implement flock() using fcntl(), the semantics of flock() will be different from those described in this manual page.

Converting a lock (shared to exclusive, or vice versa) is not guaranteed to be atomic: the existing lock is first removed, and then a new lock is established. Between these two steps, a pending lock request by another process may be granted, with the result that the conversion either blocks, or fails if LOCK_NB was specified. (This is the original BSD behaviour, and occurs on many other implementations.)

SEE ALSO

fork() function

fork - create a child process

SYNOPSIS

#include <sys/types.h>

#include <unistd.h>

pid_t fork(void);

DESCRIPTION

fork() creates a child process that differs from the parent process only in its PID and PPID, and in the fact that resource utilizations are set to 0. File locks and pending signals are not inherited.

Under Linux, fork() is implemented using copy-on-write pages, so the only penalty that it incurs is the time and memory required to duplicate the parent’s page tables, and to create a unique task structure for the child.

RETURN VALUE

On success, the PID of the child process is returned in the parent’s thread of execution, and a 0 is returned in the child’s thread of execution. On failure, a -1 will be returned in the parent’s context, no child process will be created, and errno will be set appropriately.
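The three possible return values are usually handled as in the sketch below. The helper name run_child() is hypothetical; it also reaps the child with waitpid(2), which this page does not cover.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that exits immediately with `code` (0..255).
 * Returns the exit status seen by the parent, or -1 on failure. */
int run_child(int code)
{
    pid_t pid;
    int status;

    assert(code >= 0 && code <= 255);
    pid = fork();
    if (pid == -1)
        return -1;          /* failure: no child was created */
    if (pid == 0)
        _exit(code);        /* child: fork() returned 0 */

    /* Parent: pid holds the child's PID. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child calls _exit() so that it does not flush stdio buffers inherited from the parent a second time.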

ERRORS

Error Code	Description
EAGAIN	fork() cannot allocate sufficient memory to copy the parent’s page tables and allocate a task structure for the child.
EAGAIN	It was not possible to create a new process because the caller’s RLIMIT_NPROC resource limit was encountered. To exceed this limit, the process must have either the CAP_SYS_ADMIN or the CAP_SYS_RESOURCE capability.
ENOMEM	fork() failed to allocate the necessary kernel structures because memory is tight.

CONFORMING TO

SVr4, 4.3BSD, POSIX.1-2001.

SEE ALSO

alloc_hugepages() function

alloc_hugepages, free_hugepages - allocate or free huge pages

SYNOPSIS

void *alloc_hugepages(int key, void *addr, size_t len, int prot, int flag);

int free_hugepages(void *addr);

DESCRIPTION

The len parameter is the length of the required segment. It must be a multiple of the huge page size.

The prot parameter specifies the memory protection of the segment. It is one of PROT_READ, PROT_WRITE, PROT_EXEC.

RETURN VALUE

On success, alloc_hugepages() returns the allocated virtual address, and free_hugepages() returns zero. On error, -1 is returned, and errno is set appropriately.

ERRORS

Tag	Description
ENOSYS	The system call is not supported on this kernel.

CONFORMING TO

FILES

/proc/sys/vm/nr_hugepages Number of configured hugetlb pages. This can be read and written.

/proc/meminfo Gives info on the number of configured hugetlb pages and on their size in the three variables HugePages_Total, HugePages_Free, Hugepagesize.

NOTES

The maximal number of huge pages can be specified using the hugepages= boot parameter.

fstatat() function

fstatat - get file status relative to a directory file descriptor

SYNOPSIS

#include <sys/stat.h>

int fstatat(int dirfd, const char *path, struct stat *buf, int flags);

DESCRIPTION

The fstatat() system call operates in exactly the same way as stat(2), except for the differences described in this manual page.

If the pathname given in path is relative and dirfd is the special value AT_FDCWD, then path is interpreted relative to the current working directory of the calling process (like stat(2)).

If the pathname given in path is absolute, then dirfd is ignored.

flags can either be 0, or include the following flag:

Tag	Description
AT_SYMLINK_NOFOLLOW	If path is a symbolic link, do not dereference it: instead return information about the link itself, like lstat(2). (By default, fstatat() dereferences symbolic links, like stat(2).)

RETURN VALUE

On success, fstatat() returns 0. On error, -1 is returned and errno is set to indicate the error.

ERRORS

The same errors that occur for stat(2) can also occur for fstatat(). The following additional errors can occur for fstatat():

Tag	Description
EBADF	dirfd is not a valid file descriptor.
EINVAL	Invalid flag specified in flags.
ENOTDIR	path is a relative path and dirfd is a file descriptor referring to a file other than a directory.

NOTES

See openat(2) for an explanation of the need for fstatat().

CONFORMING TO

This system call is non-standard but is proposed for inclusion in a future revision of POSIX.1. A similar system call exists on Solaris.

VERSIONS

fstatat() was added to Linux in kernel 2.6.16.

SEE ALSO

statfs() function

statfs, fstatfs - get file system statistics

SYNOPSIS

#include <sys/vfs.h>
/* or <sys/statfs.h> */

int statfs(const char *path, struct statfs *buf);

int fstatfs(int fd, struct statfs *buf);

DESCRIPTION

The function statfs() returns information about a mounted file system. path is the pathname of any file within the mounted filesystem. buf is a pointer to a statfs structure defined approximately as follows:

struct statfs {
    long   f_type;    /* type of filesystem (see below) */
    long   f_bsize;   /* optimal transfer block size */
    long   f_blocks;  /* total data blocks in file system */
    long   f_bfree;   /* free blocks in fs */
    long   f_bavail;  /* free blocks avail to non-superuser */
    long   f_files;   /* total file nodes in file system */
    long   f_ffree;   /* free file nodes in fs */
    fsid_t f_fsid;    /* file system id */
    long   f_namelen; /* maximum length of filenames */
};

File system types:

ADFS_SUPER_MAGIC 0xadf5
AFFS_SUPER_MAGIC 0xADFF
BEFS_SUPER_MAGIC 0x42465331
BFS_MAGIC 0x1BADFACE
CIFS_MAGIC_NUMBER 0xFF534D42
CODA_SUPER_MAGIC 0x73757245
COH_SUPER_MAGIC 0x012FF7B7
CRAMFS_MAGIC 0x28cd3d45
DEVFS_SUPER_MAGIC 0x1373
EFS_SUPER_MAGIC 0x00414A53
EXT_SUPER_MAGIC 0x137D
EXT2_OLD_SUPER_MAGIC 0xEF51
EXT2_SUPER_MAGIC 0xEF53
EXT3_SUPER_MAGIC 0xEF53
HFS_SUPER_MAGIC 0x4244
HPFS_SUPER_MAGIC 0xF995E849
HUGETLBFS_MAGIC 0x958458f6
ISOFS_SUPER_MAGIC 0x9660
JFFS2_SUPER_MAGIC 0x72b6
JFS_SUPER_MAGIC 0x3153464a
MINIX_SUPER_MAGIC 0x137F /* orig. minix */
MINIX_SUPER_MAGIC2 0x138F /* 30 char minix */
MINIX2_SUPER_MAGIC 0x2468 /* minix V2 */
MINIX2_SUPER_MAGIC2 0x2478 /* minix V2, 30 char names */
MSDOS_SUPER_MAGIC 0x4d44
NCP_SUPER_MAGIC 0x564c
NFS_SUPER_MAGIC 0x6969
NTFS_SB_MAGIC 0x5346544e
OPENPROM_SUPER_MAGIC 0x9fa1
PROC_SUPER_MAGIC 0x9fa0
QNX4_SUPER_MAGIC 0x002f
REISERFS_SUPER_MAGIC 0x52654973
ROMFS_MAGIC 0x7275
SMB_SUPER_MAGIC 0x517B
SYSV2_SUPER_MAGIC 0x012FF7B6
SYSV4_SUPER_MAGIC 0x012FF7B5
TMPFS_MAGIC 0x01021994
UDF_SUPER_MAGIC 0x15013346
UFS_MAGIC 0x00011954
USBDEVICE_SUPER_MAGIC 0x9fa2
VXFS_SUPER_MAGIC 0xa501FCF5
XENIX_SUPER_MAGIC 0x012FF7B4
XFS_SUPER_MAGIC 0x58465342
_XIAFS_SUPER_MAGIC 0x012FD16D

Nobody knows what f_fsid is supposed to contain (but see below).

Fields that are undefined for a particular file system are set to 0. fstatfs() returns the same information about an open file referenced by descriptor fd.
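The f_type values listed above can be used to recognise the file system a path lives on. The following is a rough sketch only: fs_type_name() and print_fs_type() are hypothetical helpers, and only a handful of the magic numbers from the table are mapped.

```c
#include <stdio.h>
#include <sys/vfs.h>

/* Map a few f_type values from the table above to readable names. */
const char *fs_type_name(unsigned long magic)
{
    switch (magic) {
    case 0xEF53:     return "ext2/ext3";  /* EXT2_SUPER_MAGIC, EXT3_SUPER_MAGIC */
    case 0x6969:     return "nfs";        /* NFS_SUPER_MAGIC */
    case 0x9fa0:     return "proc";       /* PROC_SUPER_MAGIC */
    case 0x01021994: return "tmpfs";      /* TMPFS_MAGIC */
    case 0x58465342: return "xfs";        /* XFS_SUPER_MAGIC */
    default:         return "other";
    }
}

/* Print the type of the file system containing path.
 * Returns 0 on success, -1 (errno set by statfs()) on failure. */
int print_fs_type(const char *path)
{
    struct statfs buf;

    if (statfs(path, &buf) == -1) {
        perror("statfs");
        return -1;
    }
    printf("%s: f_type=0x%lx (%s), f_bsize=%ld\n", path,
           (unsigned long)buf.f_type,
           fs_type_name((unsigned long)buf.f_type),
           (long)buf.f_bsize);
    return 0;
}
```

Calling print_fs_type("/proc") on a typical Linux system would be expected to report the proc type, but the output depends on what is mounted.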

Return Value

On success, zero is returned. On error, -1 is returned, and errno is set appropriately.

Errors

Error Code	Description
EACCES	(statfs()) Search permission is denied for a component of the path prefix of path. (See also path_resolution(2).)
EBADF	(fstatfs()) fd is not a valid open file descriptor.
EFAULT	buf or path points to an invalid address.
EINTR	This call was interrupted by a signal.
EIO	An I/O error occurred while reading from the file system.
ELOOP	(statfs()) Too many symbolic links were encountered in translating path.
ENAMETOOLONG	(statfs()) path is too long.
ENOENT	(statfs()) The file referred to by path does not exist.
ENOMEM	Insufficient kernel memory was available.
ENOSYS	The file system does not support this call.
ENOTDIR	(statfs()) A component of the path prefix of path is not a directory.
EOVERFLOW	Some values were too large to be represented in the returned struct.

Conforming To

The Linux statfs() was inspired by the 4.4BSD one (but they do not use the same structure).

Notes

The kernel has system calls statfs(), fstatfs(), statfs64(), and fstatfs64() to support this library call.

Some systems only have <sys/vfs.h>, other systems also have <sys/statfs.h>, where the former includes the latter. So it seems including the former is the best choice.

LSB has deprecated the library calls statfs() and fstatfs() and tells us to use statvfs() and fstatvfs() instead.

The f_fsid field

Solaris, Irix and POSIX have a system call statvfs(2) that returns a struct statvfs (defined in <sys/statvfs.h>) containing an unsigned long f_fsid. Linux, SunOS, HP-UX, 4.4BSD have a system call statfs() that returns a struct statfs (defined in <sys/vfs.h>) containing a fsid_t f_fsid, where fsid_t is defined as struct { int val[2]; }. The same holds for FreeBSD, except that it uses the include file <sys/mount.h>.

The general idea is that f_fsid contains some random stuff such that the pair (f_fsid,ino) uniquely determines a file. Some OSes use (a variation on) the device number, or the device number combined with the filesystem type. Several OSes restrict giving out the f_fsid field to the superuser only (and zero it for unprivileged users), because this field is used in the filehandle of the filesystem when NFS-exported, and giving it out is a security concern.

Under some OSes the fsid can be used as the second parameter to the sysfs() system call.

stat() Function

stat, fstat, lstat - get file status

Synopsis

#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

int stat(const char *path, struct stat *buf);

int fstat(int filedes, struct stat *buf);

int lstat(const char *path, struct stat *buf);

Description

These functions return information about a file. No permissions are required on the file itself, but — in the case of stat() and lstat() — execute (search) permission is required on all of the directories in path that lead to the file.

stat() stats the file pointed to by path and fills in buf. lstat() is identical to stat(), except that if path is a symbolic link, then the link itself is stat-ed, not the file that it refers to.

fstat() is identical to stat(), except that the file to be stat-ed is specified by the file descriptor filedes.

All of these system calls return a stat structure, which contains the following fields:

struct stat {
    dev_t     st_dev;     /* ID of device containing file */
    ino_t     st_ino;     /* inode number */
    mode_t    st_mode;    /* protection */
    nlink_t   st_nlink;   /* number of hard links */
    uid_t     st_uid;     /* user ID of owner */
    gid_t     st_gid;     /* group ID of owner */
    dev_t     st_rdev;    /* device ID (if special file) */
    off_t     st_size;    /* total size, in bytes */
    blksize_t st_blksize; /* blocksize for filesystem I/O */
    blkcnt_t  st_blocks;  /* number of blocks allocated */
    time_t    st_atime;   /* time of last access */
    time_t    st_mtime;   /* time of last modification */
    time_t    st_ctime;   /* time of last status change */
};

The st_dev field describes the device on which this file resides.

The st_rdev field describes the device that this file (inode) represents.

The st_size field gives the size of the file (if it is a regular file or a symbolic link) in bytes. The size of a symlink is the length of the pathname it contains, without a trailing null byte.

The st_blocks field indicates the number of blocks allocated to the file, in 512-byte units. (This may be smaller than st_size/512, for example, when the file has holes.)

The st_blksize field gives the "preferred" blocksize for efficient file system I/O. (Writing to a file in smaller chunks may cause an inefficient read-modify-rewrite.)

Not all of the Linux filesystems implement all of the time fields. Some file system types allow mounting in such a way that file accesses do not cause an update of the st_atime field. (See 'noatime' in mount(8).)

The field st_atime is changed by file accesses, e.g. by execve(2), mknod(2), pipe(2), utime(2) and read(2) (of more than zero bytes). Other routines, like mmap(2), may or may not update st_atime.

The field st_mtime is changed by file modifications, e.g. by mknod(2), truncate(2), utime(2) and write(2) (of more than zero bytes). Moreover, st_mtime of a directory is changed by the creation or deletion of files in that directory. The st_mtime field is not changed for changes in owner, group, hard link count, or mode.

The field st_ctime is changed by writing or by setting inode information (i.e., owner, group, link count, mode, etc.).

The following POSIX macros are defined to check the file type using the st_mode field:

Macro	Description
S_ISREG(m)	is it a regular file?
S_ISDIR(m)	directory?
S_ISCHR(m)	character device?
S_ISBLK(m)	block device?
S_ISFIFO(m)	FIFO (named pipe)?
S_ISLNK(m)	symbolic link? (Not in POSIX.1-1996.)
S_ISSOCK(m)	socket? (Not in POSIX.1-1996.)

The following flags are defined for the st_mode field:

S_IFMT	0170000	bitmask for the file type bitfields
S_IFSOCK	0140000	socket
S_IFLNK	0120000	symbolic link
S_IFREG	0100000	regular file
S_IFBLK	0060000	block device
S_IFDIR	0040000	directory
S_IFCHR	0020000	character device
S_IFIFO	0010000	FIFO
S_ISUID	0004000	set UID bit
S_ISGID	0002000	set-group-ID bit (see below)
S_ISVTX	0001000	sticky bit (see below)
S_IRWXU	00700	mask for file owner permissions
S_IRUSR	00400	owner has read permission
S_IWUSR	00200	owner has write permission
S_IXUSR	00100	owner has execute permission
S_IRWXG	00070	mask for group permissions
S_IRGRP	00040	group has read permission
S_IWGRP	00020	group has write permission
S_IXGRP	00010	group has execute permission
S_IRWXO	00007	mask for permissions for others (not in group)
S_IROTH	00004	others have read permission
S_IWOTH	00002	others have write permission
S_IXOTH	00001	others have execute permission

The set-group-ID bit (S_ISGID) has several special uses. For a directory it indicates that BSD semantics is to be used for that directory: files created there inherit their group ID from the directory, not from the effective group ID of the creating process, and directories created there will also get the S_ISGID bit set. For a file that does not have the group execution bit (S_IXGRP) set, the set-group-ID bit indicates mandatory file/record locking.

The ‘sticky’ bit (S_ISVTX) on a directory means that a file in that directory can be renamed or deleted only by the owner of the file, by the owner of the directory, and by a privileged process.

Linux Notes

Since kernel 2.5.48, the stat structure supports nanosecond resolution for the three file timestamp fields. Glibc exposes the nanosecond component of each field using names either of the form st_atim.tv_nsec, if the _BSD_SOURCE or _SVID_SOURCE feature test macro is defined, or of the form st_atimensec, if neither of these macros is defined. On file systems that do not support sub-second timestamps, these nanosecond fields are returned with the value 0.

For most files under the /proc directory, stat() does not return the file size in the st_size field; instead the field is returned with the value 0.

Return Value

On success, zero is returned. On error, -1 is returned, and errno is set appropriately.

Errors

Error Code	Description
EACCES	Search permission is denied for one of the directories in the path prefix of path. (See also path_resolution(2).)
EBADF	filedes is bad.
EFAULT	Bad address.
ELOOP	Too many symbolic links encountered while traversing the path.
ENAMETOOLONG	File name too long.
ENOENT	A component of the path path does not exist, or the path is an empty string.
ENOMEM	Out of memory (i.e. kernel memory).
ENOTDIR	A component of the path is not a directory.

Conforming To

These system calls conform to SVr4, 4.3BSD, POSIX.1-2001.

Use of the st_blocks and st_blksize fields may be less portable. (They were introduced in BSD. The interpretation differs between systems, and possibly on a single system when NFS mounts are involved.)

POSIX does not describe the S_IFMT, S_IFSOCK, S_IFLNK, S_IFREG, S_IFBLK, S_IFDIR, S_IFCHR, S_IFIFO, S_ISVTX bits, but instead demands the use of the macros S_ISDIR(), etc. The S_ISLNK and S_ISSOCK macros are not in POSIX.1-1996, but both are present in POSIX.1-2001; the former is from SVID 4, the latter from SUSv2.

Unix V7 (and later systems) had S_IREAD, S_IWRITE, S_IEXEC, where POSIX prescribes the synonyms S_IRUSR, S_IWUSR, S_IXUSR.

Other Systems

Values that have been (or are) in use on various systems:

hex	name	ls	octal	description
f000	S_IFMT		170000	mask for file type
0000			000000	SCO out-of-service inode, BSD unknown type
				SVID-v2 and XPG2 have both 0 and 0100000 for ordinary file
1000	S_IFIFO	p|	010000	FIFO (named pipe)
2000	S_IFCHR	c	020000	character special (V7)
3000	S_IFMPC		030000	multiplexed character special (V7)
4000	S_IFDIR	d/	040000	directory (V7)
5000	S_IFNAM		050000	XENIX named special file
				with two subtypes, distinguished by st_rdev values 1, 2:
0001	S_INSEM	s	000001	XENIX semaphore subtype of IFNAM
0002	S_INSHD	m	000002	XENIX shared data subtype of IFNAM
6000	S_IFBLK	b	060000	block special (V7)
7000	S_IFMPB		070000	multiplexed block special (V7)
8000	S_IFREG	-	100000	regular (V7)
9000	S_IFCMP		110000	VxFS compressed
9000	S_IFNWK	n	110000	network special (HP-UX)
a000	S_IFLNK	l@	120000	symbolic link (BSD)
b000	S_IFSHAD		130000	Solaris shadow inode for ACL (not seen by userspace)
c000	S_IFSOCK	s=	140000	socket (BSD; also "S_IFSOC" on VxFS)
d000	S_IFDOOR	D>	150000	Solaris door
e000	S_IFWHT	w%	160000	BSD whiteout (not used for inode)
0200	S_ISVTX		001000	‘sticky bit’: save swapped text even after use (V7)
				reserved (SVID-v2)
				On non-directories: don’t cache this file (SunOS)
				On directories: restricted deletion flag (SVID-v4.2)
0400	S_ISGID		002000	set-group-ID on execution (V7)
				for directories: use BSD semantics for propagation of GID
0400	S_ENFMT		002000	SysV file locking enforcement (shared with S_ISGID)
0800	S_ISUID		004000	set-user-ID on execution (V7)
0800	S_CDF		004000	directory is a context dependent file (HP-UX)

A sticky command appeared in Version 32V AT&T UNIX.

statvfs() Function

statvfs, fstatvfs - get file system statistics

Synopsis

#include <sys/statvfs.h>

int statvfs(const char *path, struct statvfs *buf);

int fstatvfs(int fd, struct statvfs *buf);

Description

The function statvfs() returns information about a mounted file system. path is the pathname of any file within the mounted filesystem. buf is a pointer to a statvfs structure defined approximately as follows:

struct statvfs {
    unsigned long f_bsize;   /* file system block size */
    unsigned long f_frsize;  /* fragment size */
    fsblkcnt_t    f_blocks;  /* size of fs in f_frsize units */
    fsblkcnt_t    f_bfree;   /* # free blocks */
    fsblkcnt_t    f_bavail;  /* # free blocks for non-root */
    fsfilcnt_t    f_files;   /* # inodes */
    fsfilcnt_t    f_ffree;   /* # free inodes */
    fsfilcnt_t    f_favail;  /* # free inodes for non-root */
    unsigned long f_fsid;    /* file system ID */
    unsigned long f_flag;    /* mount flags */
    unsigned long f_namemax; /* maximum filename length */
};

Here the types fsblkcnt_t and fsfilcnt_t are defined in <sys/types.h>. Both used to be unsigned long.

The field f_flag is a bit mask (of mount flags, see mount(8)). Bits defined by POSIX are

Flag	Description
ST_RDONLY	Read-only file system.
ST_NOSUID	Set-user-ID/set-group-ID bits are ignored by exec(2).

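It is unspecified whether all members of the returned struct have meaningful values on all file systems.

fstatvfs() returns the same information about an open file referenced by descriptor fd.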
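A common use of statvfs() is to estimate how much space is left on a file system. The sketch below multiplies the block counts by f_frsize, the unit given in the structure comments above; the helper name fs_space() is hypothetical.

```c
#include <sys/statvfs.h>

/* Report the total size and the space available to unprivileged users,
 * in bytes, for the file system containing path.
 * Returns 0 on success, -1 (errno set by statvfs()) on failure. */
int fs_space(const char *path, unsigned long long *total,
             unsigned long long *avail)
{
    struct statvfs buf;

    if (statvfs(path, &buf) == -1)
        return -1;
    /* Block counts are expressed in f_frsize units. */
    *total = (unsigned long long)buf.f_blocks * buf.f_frsize;
    *avail = (unsigned long long)buf.f_bavail * buf.f_frsize;
    return 0;
}
```

f_bavail is used rather than f_bfree because it excludes blocks reserved for the superuser.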

Return Value

On success, zero is returned. On error, -1 is returned, and errno is set appropriately.

Errors

Error Code	Description
EACCES	(statvfs()) Search permission is denied for a component of the path prefix of path. (See also path_resolution(2).)
EBADF	(fstatvfs()) fd is not a valid open file descriptor.
EFAULT	buf or path points to an invalid address.
EINTR	This call was interrupted by a signal.
EIO	An I/O error occurred while reading from the file system.
ELOOP	(statvfs()) Too many symbolic links were encountered in translating path.
ENAMETOOLONG	(statvfs()) path is too long.
ENOENT	(statvfs()) The file referred to by path does not exist.
ENOMEM	Insufficient kernel memory was available.
ENOSYS	The file system does not support this call.
ENOTDIR	(statvfs()) A component of the path prefix of path is not a directory.
EOVERFLOW	Some values were too large to be represented in the returned struct.

Conforming To

Solaris, Irix, POSIX.1-2001

Notes

The Linux kernel has system calls statfs() and fstatfs() to support this library call.

The current glibc implementation of

pathconf(path, _PC_REC_XFER_ALIGN);
pathconf(path, _PC_ALLOC_SIZE_MIN);
pathconf(path, _PC_REC_MIN_XFER_SIZE);

uses the f_frsize, f_frsize, and f_bsize fields of the return value of statvfs(path, buf).

See Also

statfs(2)

fsync() Function

fsync, fdatasync - synchronize a file's in-core state with the storage device

Synopsis

#include <unistd.h>

int fsync(int fd);

int fdatasync(int fd);

Description

fsync() transfers ("flushes") all modified in-core data of (i.e., modified buffer cache pages for) the file referred to by the file descriptor fd to the disk device (or other permanent storage device) where that file resides. The call blocks until the device reports that the transfer has completed. It also flushes metadata information associated with the file (see stat(2)).

Calling fsync() does not necessarily ensure that the entry in the directory containing the file has also reached disk. For that an explicit fsync() on a file descriptor for the directory is also needed.

fdatasync() is similar to fsync(), but does not flush modified metadata unless that metadata is needed in order to allow a subsequent data retrieval to be correctly handled. For example, changes to st_atime or st_mtime (respectively, time of last access and time of last modification; see stat(2)) do not require flushing because they are not necessary for a subsequent data read to be handled correctly. On the other hand, a change to the file size (st_size, as made by say ftruncate(2)), would require a metadata flush.

The aim of fdatasync(2) is to reduce disk activity for applications that do not require all metadata to be synchronised with the disk.
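The point about directory entries can be illustrated with a sketch that syncs both the new file and the directory that contains it. The helper write_durably() is hypothetical, a short write is simply treated as failure, and the _GNU_SOURCE define is an assumption about building with glibc.

```c
#define _GNU_SOURCE         /* assumption: glibc; exposes openat(), O_DIRECTORY */
#include <fcntl.h>
#include <unistd.h>

/* Create dir/name, write len bytes to it, then flush the file and the
 * directory entry. Returns 0 on success, -1 on any failure. */
int write_durably(const char *dir, const char *name,
                  const void *data, size_t len)
{
    int ok = -1;
    int fd;
    int dfd = open(dir, O_RDONLY | O_DIRECTORY);

    if (dfd == -1)
        return -1;

    fd = openat(dfd, name, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd != -1) {
        if (write(fd, data, len) == (ssize_t)len
            && fsync(fd) == 0       /* file data and metadata */
            && fsync(dfd) == 0)     /* the directory entry itself */
            ok = 0;
        if (close(fd) == -1)
            ok = -1;
    }
    close(dfd);
    return ok;
}
```

Replacing the first fsync() with fdatasync() would be a reasonable variation when only the file contents and size matter, not the timestamps.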

Return Value

On success, zero is returned. On error, -1 is returned, and errno is set appropriately.

Errors

Error Code	Description
EBADF	fd is not a valid file descriptor open for writing.
EIO	An error occurred during synchronization.
EROFS, EINVAL	fd is bound to a special file which does not support synchronization.

Notes

If the underlying hard disk has write caching enabled, then the data may not really be on permanent storage when fsync() / fdatasync() return.

When an ext2 file system is mounted with the sync option, directory entries are also implicitly synced by fsync().

On kernels before 2.4, fsync() on big files can be inefficient. An alternative might be to use the O_SYNC flag to open(2).

Conforming To

POSIX.1-2001

truncate() Function

truncate, ftruncate - truncate a file to a specified length

Synopsis

#include <unistd.h>
#include <sys/types.h>

int truncate(const char *path, off_t length);

int ftruncate(int fd, off_t length);

Description

The truncate() and ftruncate() functions cause the regular file named by path or referenced by fd to be truncated to a size of precisely length bytes.

If the file previously was larger than this size, the extra data is lost. If the file previously was shorter, it is extended, and the extended part reads as null bytes ('\0'). The file offset is not changed.

If the size changed, then the st_ctime and st_mtime fields (respectively, time of last status change and time of last modification; see stat(2)) for the file are updated, and the set-user-ID and set-group-ID permission bits may be cleared.

With ftruncate(), the file must be open for writing; with truncate(), the file must be writable.

Return Value

On success, zero is returned. On error, -1 is returned, and errno is set appropriately.

Errors

For truncate():

Error Code	Description
EACCES	Search permission is denied for a component of the path prefix, or the named file is not writable by the user. (See also path_resolution(2).)
EFAULT	Path points outside the process’s allocated address space.
EFBIG	The argument length is larger than the maximum file size. (XSI)
EINTR	A signal was caught during execution.
EINVAL	The argument length is negative or larger than the maximum file size.
EIO	An I/O error occurred updating the inode.
EISDIR	The named file is a directory.
ELOOP	Too many symbolic links were encountered in translating the pathname.
ENAMETOOLONG	A component of a pathname exceeded 255 characters, or an entire pathname exceeded 1023 characters.
ENOENT	The named file does not exist.
ENOTDIR	A component of the path prefix is not a directory.
EPERM	The underlying file system does not support extending a file beyond its current size.
EROFS	The named file resides on a read-only file system.
ETXTBSY	The file is a pure procedure (shared text) file that is being executed.

For ftruncate() the same errors apply, but instead of things that can be wrong with path, we now have things that can be wrong with fd:

Error Code	Description
EBADF	The fd is not a valid descriptor.
EBADF or EINVAL	The fd is not open for writing.
EINVAL	The fd does not reference a regular file.

Conforming To

4.4BSD, SVr4, POSIX.1-2001 (these calls first appeared in 4.2BSD).

Notes

The above description is for XSI-compliant systems. For non-XSI-compliant systems, the POSIX standard allows two behaviours for ftruncate() when length exceeds the file length (note that truncate() is not specified at all in such an environment): either returning an error, or extending the file.

Like most Unix implementations, Linux follows the XSI requirement when dealing with native file systems. However, some non-native file systems do not permit truncate() and ftruncate() to be used to extend a file beyond its current length: a notable example on Linux is VFAT.

futex() Function

futex - fast userspace locking system call

Synopsis

#include <linux/futex.h>
#include <sys/time.h>

int futex(int *uaddr, int op, int val, const struct timespec *timeout, int *uaddr2, int val3);

Description

The futex() system call provides a method for a program to wait for a value at a given address to change, and a method to wake up anyone waiting on a particular address (while the addresses for the same memory in separate processes may not be equal, the kernel maps them internally so the same memory mapped in different locations will correspond for futex() calls). It is typically used to implement the contended case of a lock in shared memory, as described in futex(7).

When a futex(7) operation did not finish uncontended in userspace, a call needs to be made to the kernel to arbitrate. Arbitration can either mean putting the calling process to sleep or, conversely, waking a waiting process.

Callers of this function are expected to adhere to the semantics as set out in futex(7). As these semantics involve writing non-portable assembly instructions, this in turn probably means that most users will in fact be library authors and not general application developers.

The uaddr argument needs to point to an aligned integer which stores the counter. The operation to execute is passed via the op parameter, along with a value val.

Five operations are currently defined:

Operation	Description
FUTEX_WAIT	This operation atomically verifies that the futex address uaddr still contains the value val, and sleeps awaiting FUTEX_WAKE on this futex address. If the timeout argument is non-NULL, its contents describe the maximum duration of the wait, which is infinite otherwise. The arguments uaddr2 and val3 are ignored. For futex(7), this call is executed if decrementing the count gave a negative value (indicating contention), and will sleep until another process releases the futex and executes the FUTEX_WAKE operation.
FUTEX_WAKE	This operation wakes at most val processes waiting on this futex address (i.e., inside FUTEX_WAIT). The arguments timeout, uaddr2 and val3 are ignored. For futex(7), this is executed if incrementing the count showed that there were waiters, once the futex value has been set to 1 (indicating that it is available).
FUTEX_FD	To support asynchronous wakeups, this operation associates a file descriptor with a futex. If another process executes a FUTEX_WAKE, the process will receive the signal number that was passed in val. The calling process must close the returned file descriptor after use. The arguments timeout, uaddr2 and val3 are ignored. To prevent race conditions, the caller should test if the futex has been upped after FUTEX_FD returns.
FUTEX_REQUEUE (since Linux 2.5.70)	This operation was introduced in order to avoid a "thundering herd" effect when FUTEX_WAKE is used and all processes woken up need to acquire another futex. This call wakes up val processes, and requeues all other waiters on the futex at address uaddr2. The arguments timeout and val3 are ignored.
FUTEX_CMP_REQUEUE (since Linux 2.6.7)	There was a race in the intended use of FUTEX_REQUEUE, so FUTEX_CMP_REQUEUE was introduced. This is similar to FUTEX_REQUEUE, but first checks whether the location uaddr still contains the value val3. If not, an error EAGAIN is returned. The argument timeout is ignored.

Return Value

Depending on which operation was executed, the returned value can have differing meanings.

Operation	Description
FUTEX_WAIT	Returns 0 if the process was woken by a FUTEX_WAKE call. In case of timeout, ETIMEDOUT is returned. If the futex was not equal to the expected value, the operation returns EWOULDBLOCK. Signals (or other spurious wakeups) cause FUTEX_WAIT to return EINTR.
FUTEX_WAKE	Returns the number of processes woken up.
FUTEX_FD	Returns the new file descriptor associated with the futex.
FUTEX_REQUEUE	Returns the number of processes woken up.
FUTEX_CMP_REQUEUE	Returns the number of processes woken up.
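Since glibc exports no futex() function, the operations above are normally reached through syscall(2). The wrappers below are a hedged sketch: futex_wait() and futex_wake() are hypothetical names, and when invoked this way a failure is reported as -1 with the error code in errno (EWOULDBLOCK has the same value as EAGAIN on Linux).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>   /* FUTEX_WAIT, FUTEX_WAKE */
#include <stddef.h>
#include <sys/syscall.h>   /* SYS_futex */
#include <time.h>          /* struct timespec */
#include <unistd.h>        /* syscall() */

/* Sleep while *uaddr still equals expected. timeout may be NULL to wait
 * without limit. Returns 0 when woken, -1 with errno set otherwise. */
long futex_wait(int *uaddr, int expected, const struct timespec *timeout)
{
    return syscall(SYS_futex, uaddr, FUTEX_WAIT, expected, timeout, NULL, 0);
}

/* Wake at most nwake waiters sleeping on uaddr.
 * Returns the number of waiters woken, or -1 with errno set. */
long futex_wake(int *uaddr, int nwake)
{
    return syscall(SYS_futex, uaddr, FUTEX_WAKE, nwake, NULL, NULL, 0);
}
```

A lock built on these would call futex_wait() only after an atomic update in user space detected contention, and futex_wake() only when releasing a lock that has waiters, as futex(7) describes.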

Errors

Error Code	Description
EACCES	No read access to futex memory.
EAGAIN	FUTEX_CMP_REQUEUE found an unexpected futex value. (This probably indicates a race; use the safe FUTEX_WAKE now.)
EFAULT	Error in getting timeout information from userspace.
EINVAL	An operation was not defined or error in page alignment.
ENFILE	The system limit on the total number of open files has been reached.

Notes

To reiterate, bare futexes are not intended as an easy to use abstraction for end-users. Implementors are expected to be assembly literate and to have read the sources of the futex userspace library referenced below.

Versions

Initial futex support was merged in Linux 2.5.7 but with different semantics from what was described above. A 4-parameter system call with the semantics given here was introduced in Linux 2.5.40. In Linux 2.5.70 one parameter was added. In Linux 2.6.7 a sixth parameter was added — messy, especially on the s390 architecture.

Conforming To

This system call is Linux specific.

futimesat() Function

futimesat - change timestamps of a file relative to a directory file descriptor

Synopsis

#include <fcntl.h>

int futimesat(int dirfd, const char *pathname, const struct timeval times[2]);

Description

The futimesat() system call operates in exactly the same way as utimes(2), except for the differences described in this manual page.

If the pathname given in pathname is relative, then it is interpreted relative to the directory referred to by the file descriptor dirfd (rather than relative to the current working directory of the calling process, as is done by utimes(2) for a relative pathname).

If the pathname given in pathname is relative and dirfd is the special value AT_FDCWD, then pathname is interpreted relative to the current working directory of the calling process (like utimes(2)).

If the pathname given in pathname is absolute, then dirfd is ignored.

Return Value

On success, futimesat() returns 0. On error, -1 is returned and errno is set to indicate the error.

Errors

The same errors that occur for utimes(2) can also occur for futimesat(). The following additional errors can occur for futimesat():

Error Code	Description
EBADF	dirfd is not a valid file descriptor.
ENOTDIR	pathname is a relative path and dirfd is a file descriptor referring to a file other than a directory.

Conforming To

This system call is non-standard but is proposed for inclusion in a future revision of POSIX.1. A similar system call exists on Solaris.

Glibc Notes

If the pathname argument is NULL, then the glibc futimesat() wrapper function updates the times for the file referred to by dirfd.

Versions

futimesat() was added to Linux in kernel 2.6.16.

getcontext() Function

getcontext, setcontext - get or set the user context

Synopsis

#include <ucontext.h>

int getcontext(ucontext_t *ucp);

int setcontext(const ucontext_t *ucp);

where:

Argument	Description
ucp	points to a structure defined in <ucontext.h> containing the signal mask, execution stack, and machine registers.

Description

getcontext(2) gets the current context of the calling process, storing it in the ucontext struct pointed to by ucp.

setcontext(2) sets the context of the calling process to the state stored in the ucontext struct pointed to by ucp. The struct must either have been created by getcontext(2) or have been passed as the third parameter of the sigaction(2) signal handler.

The ucontext struct created by getcontext(2) is defined in <ucontext.h> as follows:

typedef struct ucontext
{
    unsigned long int uc_flags;
    struct ucontext *uc_link;
    stack_t uc_stack;
    mcontext_t uc_mcontext;
    __sigset_t uc_sigmask;
    struct _fpstate __fpregs_mem;
} ucontext_t;

RETURN VALUES

getcontext(2) returns 0 on success and -1 on failure. setcontext(2) does not return a value on success and returns -1 on failure.
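The way setcontext() resumes execution at the point where getcontext() saved the context can be seen in a small loop. This is a sketch assuming a C library that provides both calls (glibc does); count_passes() is a hypothetical name.

```c
#include <ucontext.h>

/* Re-enter the same point n times by jumping back with setcontext().
 * Returns the number of passes made, or -1 if getcontext() fails. */
int count_passes(int n)
{
    ucontext_t ctx;
    volatile int passes = 0;   /* volatile: must survive the jump back */

    if (getcontext(&ctx) == -1)
        return -1;
    /* Execution resumes here after every successful setcontext(&ctx). */
    passes++;
    if (passes < n)
        setcontext(&ctx);      /* does not return on success */
    return passes;
}
```

The counter is declared volatile so that its updated value is read from memory after each jump rather than from a register restored by setcontext().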

STANDARDS

These functions conform to: XPG4-UNIX.

Notes

When a signal handler executes, the current user context is saved and a new context is created by the kernel. If the calling process leaves the signal handler using longjmp(2), the original context cannot be restored, and the results of future calls to getcontext(2) are unpredictable. To avoid this problem, use siglongjmp(2) or setcontext(2) in signal handlers instead of longjmp(2).

See Also

sigaltstack(2), sigprocmask(2), sigsetjmp(3), setjmp(3).

getcwd() Function

getcwd - get current working directory

Synopsis

/*
* This page documents the getcwd(2) system call, which
* is not defined in any user-space header files; you should
* use getcwd(3) defined in <unistd.h> instead in applications.
*/

long getcwd(char *buf, unsigned long size);

Description

The getcwd() function copies an absolute pathname of the current working directory to the array pointed to by buf, which is of length size.

If the current absolute path name would require a buffer longer than size elements, -1 is returned, and errno is set to ERANGE; an application should check for this error, and allocate a larger buffer if necessary.

If buf is NULL, the behaviour of getcwd() is undefined.

Return Value

On failure (for example, if the current directory is not readable), -1 is returned with errno set accordingly; on success, the number of characters stored in buf is returned. The contents of the array pointed to by buf are undefined on error.

Note that this return value differs from the getcwd(3) library function, which returns NULL on failure and the address of buf on success.
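The ERANGE handling recommended above amounts to retrying with a larger buffer. A minimal sketch, assuming the raw system call is reached through syscall(2) with SYS_getcwd; cwd_via_syscall() is a hypothetical helper, and applications should normally prefer getcwd(3).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdlib.h>
#include <sys/syscall.h>   /* SYS_getcwd */
#include <unistd.h>        /* syscall() */

/* Return a malloc()ed string holding the current working directory,
 * or NULL on error. The caller is responsible for free()ing it. */
char *cwd_via_syscall(void)
{
    size_t size = 64;
    char *buf = NULL;

    for (;;) {
        char *bigger = realloc(buf, size);
        if (bigger == NULL) {
            free(buf);
            return NULL;
        }
        buf = bigger;
        if (syscall(SYS_getcwd, buf, (unsigned long)size) != -1)
            return buf;             /* success: buf holds the path */
        if (errno != ERANGE) {      /* any error other than "too small" */
            free(buf);
            return NULL;
        }
        size *= 2;                  /* ERANGE: retry with a larger buffer */
    }
}
```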

Errors

Error Code	Description
ENOMEM	if user memory cannot be mapped
ENOENT	if directory does not exist (i.e. it has been deleted)
ERANGE	if not enough space available for storing the path
EFAULT	if memory access violation occurs while copying

Conforming To

The getcwd system call is Linux specific; use the getcwd C library function for portability.

getdents() Function

getdents - get directory entries

Synopsis

#include <unistd.h>
#include <linux/types.h>
#include <linux/dirent.h>
#include <linux/unistd.h>
#include <errno.h>

int getdents(unsigned int fd, struct dirent *dirp, unsigned int count);

Description

This is not the function you are interested in. Look at readdir(3) for the POSIX conforming C library interface. This page documents the bare kernel system call interface.

The system call getdents() reads several dirent structures from the directory pointed at by fd into the memory area pointed to by dirp. The parameter count is the size of the memory area.

The dirent structure is declared as follows:

struct dirent {
    long d_ino;                 /* inode number */
    off_t d_off;                /* offset to next dirent */
    unsigned short d_reclen;    /* length of this dirent */
    char d_name[NAME_MAX+1];    /* filename (null-terminated) */
};

d_ino is an inode number. d_off is the distance from the start of the directory to the start of the next dirent. d_reclen is the size of this entire dirent. d_name is a null-terminated filename.

This call supersedes readdir(2).

Return Value

On success, the number of bytes read is returned. On end of directory, 0 is returned. On error, -1 is returned, and errno is set appropriately.

Errors

Error Code	Description
EBADF	Invalid file descriptor fd.
EFAULT	Argument points outside the calling process's address space.
EINVAL	Result buffer is too small.
ENOENT	No such directory.
ENOTDIR	File descriptor does not refer to a directory.

Conforming To

SVr4.

Notes

Glibc does not provide a wrapper for this system call; call it using syscall(2).
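A sketch of calling the kernel interface through syscall(2) follows. Note the swap: it uses the related getdents64 call rather than getdents itself, because SYS_getdents is not available on every architecture. The record layout struct linux_dirent64 is declared locally and is an assumption based on the kernel's 64-bit variant, not the struct dirent shown above; count_entries() is a hypothetical helper.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stddef.h>        /* offsetof */
#include <stdint.h>
#include <string.h>        /* memcpy */
#include <sys/syscall.h>   /* SYS_getdents64 */
#include <unistd.h>

/* Assumed layout of the records filled in by getdents64. */
struct linux_dirent64 {
    uint64_t       d_ino;     /* inode number */
    int64_t        d_off;     /* offset to next record */
    unsigned short d_reclen;  /* length of this record */
    unsigned char  d_type;    /* file type */
    char           d_name[];  /* null-terminated filename */
};

/* Count the entries (including "." and "..") in dir; -1 on error. */
int count_entries(const char *dir)
{
    char buf[8192];
    int count = 0;
    long nread;
    int fd = open(dir, O_RDONLY | O_DIRECTORY);

    if (fd == -1)
        return -1;
    while ((nread = syscall(SYS_getdents64, fd, buf, sizeof buf)) > 0) {
        long pos = 0;
        while (pos < nread) {
            unsigned short reclen;
            /* Copy d_reclen out of the byte buffer to avoid misaligned access. */
            memcpy(&reclen,
                   buf + pos + offsetof(struct linux_dirent64, d_reclen),
                   sizeof reclen);
            count++;
            pos += reclen;          /* step to the next record */
        }
    }
    close(fd);
    return nread == -1 ? -1 : count;
}
```

The inner loop walks the buffer by d_reclen because the records are variable-length; the outer loop repeats until the call returns 0 at end of directory.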

See Also

readdir(2)

getdomainname() Function

getdomainname, setdomainname - get/set domain name

Synopsis

#include <unistd.h>

int getdomainname(char *name, size_t len);

int setdomainname(const char *name, size_t len);

Description

These functions are used to access or to change the domain name of the current processor. If the null-terminated domain name requires more than len bytes, getdomainname() returns the first len bytes (glibc) or returns an error (libc).

Return Value

On success, zero is returned. On error, -1 is returned, and errno is set appropriately.

Errors

Error Code	Description
EFAULT	For setdomainname(): name pointed outside of user address space.
EINVAL	For getdomainname() under libc: name is NULL or name is longer than len bytes.
EINVAL	For setdomainname(): len was negative or too large.
EPERM	For setdomainname(): the caller is unprivileged (Linux: does not have the CAP_SYS_ADMIN capability).

Conforming To

POSIX does not specify these calls.

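getdtablesize() Function

getdtablesize - get descriptor table size

Synopsis

#include <unistd.h>

int getdtablesize(void);

Description

getdtablesize() returns the maximum number of files a process can have open, one more than the largest possible value for a file descriptor.

Return Value

The current limit on the number of open files per process.

Notes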

getdtablesize() is implemented as a libc library function. The glibc version calls getrlimit(2) and returns the current RLIMIT_NOFILE limit, or OPEN_MAX when that fails. The libc4 and libc5 versions return OPEN_MAX (set to 256 since Linux 0.98.4).
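The getrlimit(2) approach described for glibc can be approximated directly. This sketch is not the glibc source: fd_table_size() is a hypothetical helper, and it returns -1 on failure instead of falling back to OPEN_MAX.

```c
#include <sys/resource.h>

/* Return the current (soft) RLIMIT_NOFILE limit, or -1 if
 * getrlimit() fails. */
long fd_table_size(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) == -1)
        return -1;
    return (long)rl.rlim_cur;
}
```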

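Conforming To

SVr4, 4.4BSD (the getdtablesize() function first appeared in 4.2BSD).

getgid() Function

getgid, getegid - get group identity

Synopsis

#include <unistd.h>
#include <sys/types.h>

gid_t getgid(void);

gid_t getegid(void);

Description

getgid() returns the real group ID of the current process.

getegid() returns the effective group ID of the current process.

Errors

These functions are always successful.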

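Conforming To

POSIX.1-2001, 4.3BSD.

getuid() function

getuid, geteuid - get user identity

Synopsis

#include <unistd.h>
#include <sys/types.h>

uid_t getuid(void);
uid_t geteuid(void);

Description

getuid() returns the real user ID of the current process.

geteuid() returns the effective user ID of the current process.

Errors

These functions are always successful.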
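One common use of the real and effective IDs is to detect that a program is running with changed credentials, for example as a set-user-ID binary. A minimal sketch, with a hypothetical helper name:

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: 1 if the real and effective user or group IDs differ,
 * 0 if they are the same. The four calls cannot fail, so no error path exists. */
int running_with_changed_ids(void)
{
    return getuid() != geteuid() || getgid() != getegid();
}
```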

Conforming To

POSIX.1-2001, 4.3BSD.

History

In Unix V6 the getuid() call returned (euid << 8) + uid. Unix V7 introduced separate calls getuid() and geteuid().

getgroups() function

getgroups, setgroups - get/set list of supplementary group IDs

Synopsis

#include <sys/types.h>
#include <unistd.h>

int getgroups(int size, gid_t list[]);

#include <grp.h>

int setgroups(size_t size, const gid_t *list);

Description

Tag	Description
getgroups()	Up to size supplementary group IDs (of the calling process) are returned in list. It is unspecified whether the effective group ID of the calling process is included in the returned list. (Thus, an application should also call getegid(2) and add or remove the resulting value.) If size is zero, list is not modified, but the total number of supplementary group IDs for the process is returned.
setgroups()	Sets the supplementary group IDs for the process. Appropriate privileges (Linux: the CAP_SETGID capability) are required.

Return Value

Tag	Description
getgroups()	On success, the number of supplementary group IDs is returned. On error, -1 is returned, and errno is set appropriately.
setgroups()	On success, zero is returned. On error, -1 is returned, and errno is set appropriately.

Errors

Tag	Description
EFAULT	list has an invalid address.
EINVAL	For setgroups(), size is greater than NGROUPS (32 for Linux 2.0.32). For getgroups(), size is less than the number of supplementary group IDs, but is not zero.
EPERM	The calling process has insufficient privilege to call setgroups().

Notes

A process can have up to at least NGROUPS_MAX supplementary group IDs in addition to the effective group ID. The set of supplementary group IDs is inherited from the parent process and may be changed using setgroups(). The maximum number of supplementary group IDs can be found using sysconf(3):

long ngroups_max;
ngroups_max = sysconf(_SC_NGROUPS_MAX);

The maximal return value of getgroups() cannot be larger than one more than the value obtained this way.

The prototype for setgroups() is only available if _BSD_SOURCE is defined.
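The size-zero behaviour of getgroups() lends itself to a two-step pattern: ask for the count, allocate, then fetch. The sketch below illustrates that pattern; fetch_groups() is a hypothetical name, and the caller is assumed to free() the result.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return a malloc()ed array holding the supplementary
 * group IDs and store their number in *count. Returns NULL on failure. */
gid_t *fetch_groups(int *count)
{
    int n = getgroups(0, NULL);         /* size 0: report the count only */
    gid_t *list;

    if (n == -1)
        return NULL;
    list = malloc((size_t)(n > 0 ? n : 1) * sizeof *list);
    if (list == NULL)
        return NULL;
    n = getgroups(n, list);             /* EINVAL here if the set grew in between */
    if (n == -1) {
        free(list);
        return NULL;
    }
    *count = n;
    return list;
}
```

Because the returned list may or may not contain the effective group ID, a caller that needs the complete set would still combine it with getegid(2).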

Conforming To

SVr4, 4.3BSD. The getgroups() function is in POSIX.1-2001. Since setgroups() requires privilege, it is not covered by POSIX.1-2001.

gethostid() function

gethostid, sethostid - get or set the unique identifier of the current host

Synopsis

#include <unistd.h>

long gethostid(void);

int sethostid(long hostid);

Description

Get or set a unique 32-bit identifier for the current machine. The 32-bit identifier is intended to be unique among all UNIX systems in existence. This normally resembles the Internet address for the local machine, as returned by gethostbyname(3), and thus usually never needs to be set.

The sethostid() call is restricted to the superuser.

The hostid argument is stored in the file /etc/hostid.

Return Value

gethostid() returns the 32-bit identifier for the current host as set by sethostid(2).

Conforming To

4.2BSD; these functions were dropped in 4.4BSD. SVr4 includes gethostid() but not sethostid(). POSIX.1-2001 specifies gethostid() but not sethostid().

Files

/etc/hostid

Example

id = gethostid();
/* This is a no-op unless unsigned int is wider than 32 bits. */
id &= 0xffffffff;

See Also

hostid(1)

gethostname() function

gethostname, sethostname - get/set host name

Synopsis

#include <unistd.h>

int gethostname(char *name, size_t len);

int sethostname(const char *name, size_t len);

Description

These system calls are used to access or to change the host name of the current processor. The gethostname() system call returns a null-terminated hostname (set earlier by sethostname()) in the array name that has a length of len bytes. In case the null-terminated hostname does not fit, no error is returned, but the hostname is truncated. It is unspecified whether the truncated hostname will be null-terminated.

Return Value

On success, zero is returned. On error, -1 is returned, and errno is set appropriately.

Errors

Tag	Description
EFAULT	name is an invalid address.
EINVAL	len is negative or, for sethostname(), len is larger than the maximum allowed size, or, for gethostname() on Linux/i386, len is smaller than the actual size. (In this last case glibc 2.1 uses ENAMETOOLONG.)
EPERM	For sethostname(), the caller did not have the CAP_SYS_ADMIN capability.

Conforming To

SVr4, 4.4BSD (these interfaces first appeared in 4.2BSD). POSIX.1-2001 specifies gethostname() but not sethostname().

Notes

SUSv2 guarantees that ‘Host names are limited to 255 bytes’. POSIX.1-2001 guarantees that ‘Host names (not including the terminating null byte) are limited to HOST_NAME_MAX bytes’.

Glibc Notes

The GNU C library implements gethostname() as a library function that calls uname(2) and copies up to len bytes from the returned nodename field into name. Having performed the copy, the function then checks if the length of the nodename was greater than or equal to len, and if it is, then the function returns -1 with errno set to ENAMETOOLONG. Versions of glibc before 2.2 handle the case where the length of the nodename was greater than or equal to len differently: nothing is copied into name and the function returns -1 with errno set to ENAMETOOLONG.
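Because a truncated hostname is not guaranteed to be null-terminated, callers usually size the buffer from HOST_NAME_MAX and terminate it themselves. This is a sketch under those assumptions; read_host_name() is a hypothetical name, and the fallback value 255 is taken from the SUSv2 limit quoted above.

```c
#define _DEFAULT_SOURCE /* assumption: glibc feature-test macro for gethostname() */
#include <limits.h>
#include <stddef.h>
#include <unistd.h>

#ifndef HOST_NAME_MAX
#define HOST_NAME_MAX 255 /* assumption: fall back to the SUSv2 limit */
#endif

/* Hypothetical helper: copy the hostname into buf (capacity buflen).
 * Returns 0 on success, -1 on failure (including ENAMETOOLONG from glibc). */
int read_host_name(char *buf, size_t buflen)
{
    if (buf == NULL || buflen == 0)
        return -1;
    if (gethostname(buf, buflen) == -1)
        return -1;
    buf[buflen - 1] = '\0';             /* termination is unspecified after truncation */
    return 0;
}
```

A buffer declared as char name[HOST_NAME_MAX + 1] leaves room for the longest permitted name plus the terminating null byte.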

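getitimer() function

getitimer, setitimer - get or set value of an interval timer

Synopsis

#include <sys/time.h>

int getitimer(int which, struct itimerval *value);
int setitimer(int which, const struct itimerval *value, struct itimerval *ovalue);

Description

The system provides each process with three interval timers, each decrementing in a distinct time domain. When any timer expires, a signal is sent to the process, and the timer (potentially) restarts.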

Tag	Description
ITIMER_REAL	decrements in real time, and delivers SIGALRM upon expiration.
ITIMER_VIRTUAL	decrements only when the process is executing, and delivers SIGVTALRM upon expiration.
ITIMER_PROF	decrements both when the process executes and when the system is executing on behalf of the process. Coupled with ITIMER_VIRTUAL, this timer is usually used to profile the time spent by the application in user and kernel space. SIGPROF is delivered upon expiration.

Timer values are defined by the following structures:

struct itimerval {
    struct timeval it_interval; /* next value */
    struct timeval it_value;    /* current value */
};
struct timeval {
    long tv_sec;                /* seconds */
    long tv_usec;               /* microseconds */
};

The function getitimer() fills the structure indicated by value with the current setting for the timer indicated by which (one of ITIMER_REAL, ITIMER_VIRTUAL, or ITIMER_PROF). The element it_value is set to the amount of time remaining on the timer, or zero if the timer is disabled. Similarly, it_interval is set to the reset value. The function setitimer() sets the indicated timer to the value in value. If ovalue is non-NULL, the old value of the timer is stored there.

Timers decrement from it_value to zero, generate a signal, and reset to it_interval. A timer which is set to zero (it_value is zero or the timer expires and it_interval is zero) stops.

Both tv_sec and tv_usec are significant in determining the duration of a timer.

Timers will never expire before the requested time, but may expire some (short) time afterwards, which depends on the system timer resolution and on the system load. (But see BUGS below.) Upon expiration, a signal will be generated and the timer reset. If the timer expires while the process is active (always true for ITIMER_VIRTUAL) the signal will be delivered immediately when generated. Otherwise the delivery will be offset by a small time dependent on the system loading.
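The interaction between it_value, it_interval and signal delivery can be sketched with a one-shot ITIMER_REAL timer. The code below is an illustration, not a definitive implementation: wait_one_shot() and on_alarm() are hypothetical names, and the handler is left installed when the function returns.

```c
#define _DEFAULT_SOURCE /* assumption: glibc feature-test macro for sigaction() */
#include <signal.h>
#include <string.h>
#include <sys/time.h>

static volatile sig_atomic_t alarm_fired = 0;

static void on_alarm(int signo)
{
    (void)signo;
    alarm_fired = 1;
}

/* Hypothetical helper: arm ITIMER_REAL to expire once after usec microseconds
 * and block until SIGALRM arrives. Returns 0 on success, -1 on failure. */
int wait_one_shot(long usec)
{
    struct sigaction sa;
    struct itimerval tv;
    sigset_t block, old, wait_mask;

    if (usec <= 0)
        return -1;                      /* a zero it_value would disable the timer */

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) == -1)
        return -1;

    /* Block SIGALRM so it cannot arrive before sigsuspend() is waiting. */
    sigemptyset(&block);
    sigaddset(&block, SIGALRM);
    if (sigprocmask(SIG_BLOCK, &block, &old) == -1)
        return -1;
    wait_mask = old;
    sigdelset(&wait_mask, SIGALRM);

    alarm_fired = 0;
    tv.it_interval.tv_sec = 0;          /* zero interval: the timer stops after firing */
    tv.it_interval.tv_usec = 0;
    tv.it_value.tv_sec = usec / 1000000;
    tv.it_value.tv_usec = usec % 1000000;
    if (setitimer(ITIMER_REAL, &tv, NULL) == -1) {
        sigprocmask(SIG_SETMASK, &old, NULL);
        return -1;
    }

    while (!alarm_fired)
        sigsuspend(&wait_mask);         /* atomically unblock SIGALRM and wait */

    sigprocmask(SIG_SETMASK, &old, NULL);
    return 0;
}
```

Setting a non-zero it_interval instead would make the timer reload itself after each expiration, which is the periodic case described above.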

Return Value

On success, zero is returned. On error, -1 is returned, and errno is set appropriately.

Errors

Tag	Description
EFAULT	value or ovalue are not valid pointers.
EINVAL	which is not one of ITIMER_REAL, ITIMER_VIRTUAL, or ITIMER_PROF.

Notes

A child created via fork(2) does not inherit its parent’s interval timers. Interval timers are preserved across an execve(2).

Conforming To

POSIX.1-2001, SVr4, 4.4BSD (this call first appeared in 4.2BSD).

BUGS

The generation and delivery of a signal are distinct, and only one instance of each of the signals listed above may be pending for a process. Under very heavy loading, an ITIMER_REAL timer may expire before the signal from a previous expiration has been delivered. The second signal in such an event will be lost.

On Linux, timer values are represented in jiffies. If a request is made to set a timer with a value whose jiffies representation exceeds MAX_SEC_IN_JIFFIES (defined in include/linux/jiffies.h), then the timer is silently truncated to this ceiling value. On Linux/x86 (where, since kernel 2.6.13, the default jiffy is 0.004 seconds), this means that the ceiling value for a timer is approximately 99.42 days.

On certain systems (including x86), Linux kernels before version 2.6.12 have a bug which will produce premature timer expirations of up to one jiffy under some circumstances. This bug is fixed in kernel 2.6.12.

POSIX.1-2001 says that setitimer() should fail if a tv_usec value is specified that is outside of the range 0 to 999999. However, Linux does not give an error, but instead silently adjusts the corresponding seconds value for the timer. In the future (scheduled for March 2007), this non-conformance will be repaired: existing applications should be fixed now to ensure that they supply a properly formed tv_usec value.

get_kernel_syms() function

get_kernel_syms - retrieve exported kernel and module symbols

Synopsis

#include <linux/module.h>

int get_kernel_syms(struct kernel_sym *table);

Description

If table is NULL, get_kernel_syms() returns the number of symbols available for query. Otherwise it fills in a table of structures:

struct kernel_sym {
    unsigned long value;
    char name[60];
};

The symbols are interspersed with magic symbols of the form #module-name with the kernel having an empty name. The value associated with a symbol of this form is the address at which the module is loaded.

The symbols exported from each module follow their magic module tag and the modules are returned in the reverse of the order in which they were loaded.

Return Value

Returns the number of symbols copied to table. There is no possible error return.

Conforming To

get_kernel_syms() is Linux specific.

BUGS

There is no way to indicate the size of the buffer allocated for table. If symbols have been added to the kernel since the program queried for the symbol table size, memory will be corrupted.

The length of exported symbol names is limited to 59 characters.

Because of these limitations, this system call is deprecated in favor of query_module(2) (which is itself nowadays deprecated in favor of other interfaces described on its manual page).

Notes

This system call is only present on Linux up until kernel 2.4; it was removed in Linux 2.6.

unimplemented() function

afs_syscall, break, fattach, fdetach, ftime, getmsg, getpmsg, gtty, isastream, lock, mpx, multiplexer, prof, profil, putmsg, putpmsg, security, stty, ulimit, vserver - unimplemented system calls

Synopsis

Unimplemented system calls.

Description

These system calls are not implemented in the Linux 2.4 kernel.

Return Value

These system calls always return -1 and set errno to ENOSYS.

Notes

Note that ftime(3), profil(3) and ulimit(3) are implemented as library functions.

Some system calls, like alloc_hugepages(2), free_hugepages(2), ioperm(2), iopl(2), and vm86(2) only exist on certain architectures.

Some system calls, like ipc(2), create_module(2), init_module(2), and delete_module(2) only exist when the Linux kernel was built with support for them.

See Also

obsolete(2)

getpagesize() function

getpagesize - get memory page size

Synopsis

#include <unistd.h>

int getpagesize(void);

Description

The function getpagesize() returns the number of bytes in a page, where a "page" is the unit meant in the description of mmap(2), which says that files are mapped in page-sized units.

The size of the pages that mmap() uses can be found using

#include <unistd.h>
long sz = sysconf(_SC_PAGESIZE);

(where some systems also allow the synonym _SC_PAGE_SIZE for _SC_PAGESIZE), or

#include <unistd.h>
int sz = getpagesize();

HISTORY

This call first appeared in 4.2BSD.

Conforming To

SVr4, 4.4BSD, SUSv2. In SUSv2 the getpagesize() call is labeled LEGACY, and in POSIX.1-2001 it has been dropped. HP-UX does not have this call.

Notes

Whether getpagesize() is present as a Linux system call depends on the architecture. If it is, it returns the kernel symbol PAGE_SIZE, which is architecture and machine model dependent. Generally, one uses binaries that are architecture but not machine model dependent, in order to have a single binary distribution per architecture. This means that a user program should not find PAGE_SIZE at compile time from a header file, but use an actual system call, at least for those architectures (like sun4) where this dependency exists. Here libc4, libc5, glibc 2.0 fail because their getpagesize() returns a statically derived value, and does not use a system call. Things are OK in glibc 2.1.
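A typical reason to query the page size at run time rather than compile time is to round a length up to whole pages before calling mmap(2). The sketch below uses the sysconf(_SC_PAGESIZE) form; round_up_to_page() is a hypothetical name and overflow for lengths near SIZE_MAX is not handled.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: round len up to a multiple of the run-time page size.
 * Returns 0 if the page size cannot be determined (or if len is 0). */
size_t round_up_to_page(size_t len)
{
    long page = sysconf(_SC_PAGESIZE);  /* queried at run time, not from a header */
    size_t p;

    if (page <= 0)
        return 0;
    p = (size_t)page;
    return (len + p - 1) / p * p;
}
```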

See Also

mmap(2)

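getpeername() function

getpeername - get name of connected peer socket

Synopsis

#include <sys/socket.h>

int getpeername(int s, struct sockaddr *name, socklen_t *namelen);

Description

getpeername() returns the name of the peer connected to socket s. The namelen parameter should be initialized to indicate the amount of space pointed to by name. On return it contains the actual size of the name returned (in bytes). The name is truncated if the buffer provided is too small.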

Return Value

On success, zero is returned. On error, -1 is returned, and errno is set appropriately.

Errors

Tag	Description
EBADF	The argument s is not a valid descriptor.
EFAULT	The name parameter points to memory not in a valid part of the process address space.
EINVAL	namelen is invalid (e.g., is negative).
ENOBUFS	Insufficient resources were available in the system to perform the operation.
ENOTCONN	The socket is not connected.
ENOTSOCK	The argument s is a file, not a socket.

Conforming To

SVr4, 4.4BSD (the getpeername() function call first appeared in 4.2BSD), POSIX.1-2001.

Notes

The third argument of getpeername() is in reality an int * (and this is what 4.x BSD and libc4 and libc5 have). Some POSIX confusion resulted in the present socklen_t, also used by glibc. See also accept(2).
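The value-result use of namelen can be sketched as follows. The helper peer_family() is a hypothetical name; it uses struct sockaddr_storage so the buffer is large enough for any address family, which is a design choice of this sketch rather than something the text above requires.

```c
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: return the address family of the peer connected to
 * socket s, or -1 on failure (errno set by getpeername()). */
int peer_family(int s)
{
    struct sockaddr_storage peer;
    socklen_t len = sizeof peer;        /* in: buffer size; out: size of the name */

    memset(&peer, 0, sizeof peer);
    if (getpeername(s, (struct sockaddr *)&peer, &len) == -1)
        return -1;
    return peer.ss_family;
}
```

For an AF_INET peer, the same buffer could afterwards be cast to struct sockaddr_in to read the port and address.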

setpgid() function

setpgid, getpgid, setpgrp, getpgrp - set/get process group

Synopsis

#include <unistd.h>

int setpgid(pid_t pid, pid_t pgid);

pid_t getpgid(pid_t pid);

int setpgrp(void);

pid_t getpgrp(void);

Description

setpgid() sets the process group ID of the process specified by pid to pgid. If pid is zero, the process ID of the current process is used. If pgid is zero, the process ID of the process specified by pid is used. If setpgid() is used to move a process from one process group to another (as is done by some shells when creating pipelines), both process groups must be part of the same session. In this case, the pgid specifies an existing process group to be joined and the session ID of that group must match the session ID of the joining process.

getpgid() returns the process group ID of the process specified by pid. If pid is zero, the process ID of the current process is used.

The call setpgrp() is equivalent to setpgid(0,0).

Similarly, getpgrp() is equivalent to getpgid(0). Each process group is a member of a session and each process is a member of the session of which its process group is a member.

Process groups are used for distribution of signals, and by terminals to arbitrate requests for their input: processes that have the same process group as the terminal are foreground and may read, while others will block with a signal if they attempt to read. These calls are thus used by programs such as csh(1) to create process groups in implementing job control. The TIOCGPGRP and TIOCSPGRP calls described in termios(3) are used to get/set the process group of the control terminal.

If a session has a controlling terminal, CLOCAL is not set and a hangup occurs, then the session leader is sent a SIGHUP. If the session leader exits, the SIGHUP signal will be sent to each process in the foreground process group of the controlling terminal.

If the exit of the process causes a process group to become orphaned, and if any member of the newly-orphaned process group is stopped, then a SIGHUP signal followed by a SIGCONT signal will be sent to each process in the newly-orphaned process group.
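Shells that implement job control usually call setpgid() in both the parent and the child after fork(2), so that the new group exists regardless of which process runs first. The following sketch illustrates that idiom; spawn_group_leader() and stop_child() are hypothetical names, and the child simply idles until it is signalled.

```c
#define _DEFAULT_SOURCE /* assumption: glibc feature-test macro for kill() and getpgid() */
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that leads its own new process group.
 * Returns the child's PID in the parent, or -1 on failure. */
pid_t spawn_group_leader(void)
{
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        setpgid(0, 0);                  /* child: group ID becomes its own PID */
        for (;;)
            pause();                    /* idle until a terminating signal arrives */
    }
    /* Parent repeats the call; EACCES would only mean the child already exec'd. */
    if (setpgid(pid, pid) == -1 && errno != EACCES) {
        kill(pid, SIGKILL);
        waitpid(pid, NULL, 0);
        return -1;
    }
    return pid;
}

/* Hypothetical helper: terminate and reap a child made by spawn_group_leader(). */
int stop_child(pid_t pid)
{
    if (kill(pid, SIGTERM) == -1)
        return -1;
    return waitpid(pid, NULL, 0) == pid ? 0 : -1;
}
```

After spawn_group_leader() returns, getpgid(child) in the parent reports the child's own PID, which differs from the parent's process group.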

Return Value

On success, setpgid() and setpgrp() return zero. On error, -1 is returned, and errno is set appropriately.

getpgid() returns a process group on success. On error, -1 is returned, and errno is set appropriately.

getpgrp() always returns the current process group.

Errors

Tag	Description
EACCES	An attempt was made to change the process group ID of one of the children of the calling process and the child had already performed an execve() (setpgid(), setpgrp()).
EINVAL	pgid is less than 0 (setpgid(), setpgrp()).
EPERM	An attempt was made to move a process into a process group in a different session, or to change the process group ID of one of the children of the calling process and the child was in a different session, or to change the process group ID of a session leader (setpgid(), setpgrp()).
ESRCH	For getpgid(): pid does not match any process. For setpgid(): pid is not the current process and not a child of the current process.

Conforming To

The functions setpgid() and getpgrp() conform to POSIX.1-2001. The function setpgrp() is from 4.2BSD. The function getpgid() conforms to SVr4.

Notes

A child created via fork(2) inherits its parent’s process group ID. The process group ID is preserved across an execve(2).

POSIX took setpgid() from the BSD function setpgrp(). Also System V has a function with the same name, but it is identical to setsid(2).

To get the prototypes under glibc, define both _XOPEN_SOURCE and _XOPEN_SOURCE_EXTENDED, or use "#define _XOPEN_SOURCE n" for some integer n larger than or equal to 500.

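getpid() function

getpid, getppid - get process identification

Synopsis

#include <sys/types.h>
#include <unistd.h>

pid_t getpid(void);
pid_t getppid(void);

Description

getpid() returns the process ID of the current process. (This is often used by routines that generate unique temporary filenames.)

getppid() returns the process ID of the parent of the current process.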
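The relationship between the two calls can be checked across a fork(2): the value the child gets from getppid() should equal what the parent got from getpid(). A minimal sketch, with a hypothetical helper name:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child and report whether the child's getppid()
 * matched the parent's getpid(). Returns 1 if so, 0 if not, -1 on failure. */
int child_sees_parent(void)
{
    pid_t parent = getpid();
    pid_t pid = fork();
    int status;

    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(getppid() == parent ? 0 : 1);   /* child reports via exit status */
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

The parent stays alive in waitpid() while the child runs, so the child is not re-parented before it makes the comparison.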

Conforming To

POSIX.1-2001, 4.3BSD, SVr4

getpriority() function

getpriority, setpriority - get/set program scheduling priority

Synopsis

#include <sys/time.h>
#include <sys/resource.h>

int getpriority(int which, int who);
int setpriority(int which, int who, int prio);

Description

The scheduling priority of the process, process group, or user, as indicated by which and who, is obtained with the getpriority() call and set with the setpriority() call.

The value which is one of PRIO_PROCESS, PRIO_PGRP, or PRIO_USER, and who is interpreted relative to which (a process identifier for PRIO_PROCESS, process group identifier for PRIO_PGRP, and a user ID for PRIO_USER). A zero value for who denotes (respectively) the calling process, the process group of the calling process, or the real user ID of the calling process. Prio is a value in the range -20 to 19 (but see the Notes below). The default priority is 0; lower priorities cause more favorable scheduling.

The getpriority() call returns the highest priority (lowest numerical value) enjoyed by any of the specified processes. The setpriority() call sets the priorities of all of the specified processes to the specified value. Only the superuser may lower priorities.

Return Value

Since getpriority() can legitimately return the value -1, it is necessary to clear the external variable errno prior to the call, then check it afterwards to determine if a -1 is an error or a legitimate value. The setpriority() call returns 0 if there is no error, or -1 if there is.
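The errno-clearing idiom described above can be sketched as follows; current_nice() is a hypothetical helper name.

```c
#include <errno.h>
#include <sys/time.h>
#include <sys/resource.h>

/* Hypothetical helper: store the nice value of the calling process in
 * *nice_out. Returns 0 on success, -1 on failure. */
int current_nice(int *nice_out)
{
    int prio;

    errno = 0;                          /* -1 is a legitimate nice value */
    prio = getpriority(PRIO_PROCESS, 0);
    if (prio == -1 && errno != 0)
        return -1;                      /* a real error: errno was set by the call */
    *nice_out = prio;
    return 0;
}
```

Returning the value through a pointer keeps the error indication separate from the priority itself, which avoids repeating the ambiguity in the wrapper.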

Errors

Tag	Description
EINVAL	which was not one of PRIO_PROCESS, PRIO_PGRP, or PRIO_USER.
ESRCH	No process was located using the which and who values specified.
In addition to the errors indicated above, setpriority() may fail if:
EPERM	A process was located, but its effective user ID did not match either the effective or the real user ID of the caller, and was not privileged (on Linux: did not have the CAP_SYS_NICE capability). But see NOTES below.
EACCES	The caller attempted to lower a process priority, but did not have the required privilege (on Linux: did not have the CAP_SYS_NICE capability). Since Linux 2.6.12, this error only occurs if the caller attempts to set a process priority outside the range of the RLIMIT_NICE soft resource limit of the target process; see getrlimit(2) for details.

Notes

A child created by fork(2) inherits its parent’s nice value. The nice value is preserved across execve(2).

The details on the condition for EPERM depend on the system. The above description is what POSIX.1-2001 says, and seems to be followed on all System V-like systems. Linux kernels before 2.6.12 required the real or effective user ID of the caller to match the real user of the process who (instead of its effective user ID). Linux 2.6.12 and later require the effective user ID of the caller to match the real or effective user ID of the process who. All BSD-like systems (SunOS 4.1.3, Ultrix 4.2, 4.3BSD, FreeBSD 4.3, OpenBSD-2.5, ...) behave in the same manner as Linux >= 2.6.12.

The actual priority range varies between kernel versions. Linux before 1.3.36 had -infinity..15. Since kernel 1.3.43 Linux has the range -20..19. Within the kernel, nice values are actually represented using the corresponding range 40..1 (since negative numbers are error codes) and these are the values employed by the setpriority() and getpriority() system calls. The glibc wrapper functions for these system calls handle the translations between the user-land and kernel representations of the nice value according to the formula unice = 20 - knice.
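The translation formula is its own inverse, which the following sketch makes explicit. Both function names are hypothetical; they only restate unice = 20 - knice and do not call into the kernel.

```c
/* Hypothetical helpers restating unice = 20 - knice.
 * Kernel range 40..1 maps onto user-space range -20..19. */
int user_nice_from_kernel(int knice)
{
    return 20 - knice;
}

int kernel_nice_from_user(int unice)
{
    return 20 - unice;                  /* same formula solved for knice */
}
```

For example, the kernel value 40 corresponds to the most favourable user-space nice value of -20, and the kernel value 1 to the least favourable value of 19.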

On some systems, the range of nice values is -20..20.

Including <sys/time.h> is not required these days, but increases portability. (Indeed, <sys/resource.h> defines the rusage structure with fields of type struct timeval defined in <sys/time.h>.)

Conforming To

SVr4, 4.4BSD (these function calls first appeared in 4.2BSD), POSIX.1-2001.

getresuid() function

getresuid, getresgid - get real, effective and saved user or group ID

Synopsis

#define _GNU_SOURCE
#include <unistd.h>

int getresuid(uid_t *ruid, uid_t *euid, uid_t *suid);
int getresgid(gid_t *rgid, gid_t *egid, gid_t *sgid);

Description

getresuid() and getresgid() (both introduced in Linux 2.1.44) get the real UID, effective UID, and saved set-user-ID (respectively, the real, effective, and saved set-group-ID) of the current process.

Return Value

On success, zero is returned. On error, -1 is returned, and errno is set appropriately.

Errors

Tag	Description
EFAULT	One of the arguments specified an address outside the calling program’s address space.

Conforming To

These calls are non-standard; they also appear on HP-UX and some of the BSDs.

The prototype is given by glibc since version 2.3.2 provided _GNU_SOURCE is defined.
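A short sketch of reading all three user IDs in one call, assuming glibc with _GNU_SOURCE as the synopsis requires; has_uniform_uids() is a hypothetical helper name.

```c
#define _GNU_SOURCE /* required for the getresuid() prototype under glibc */
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: 1 if the real, effective and saved set-user-IDs are
 * all equal, 0 if any differ, -1 if getresuid() fails. */
int has_uniform_uids(void)
{
    uid_t ruid, euid, suid;

    if (getresuid(&ruid, &euid, &suid) == -1)
        return -1;
    return ruid == euid && euid == suid;
}
```

A result of 0 indicates that the process could still switch to a different user ID, because one of the three IDs retains another identity.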

getrlimit() function

getrlimit, setrlimit - get/set resource limits

Synopsis

#include <sys/time.h>
#include <sys/resource.h>

int getrlimit(int resource, struct rlimit *rlim);
int setrlimit(int resource, const struct rlimit *rlim);

Description

getrlimit() and setrlimit() get and set resource limits respectively. Each resource has an associated soft and hard limit, as defined by the rlimit structure (the rlim argument to both getrlimit() and setrlimit()):

struct rlimit {
    rlim_t rlim_cur; /* Soft limit */
    rlim_t rlim_max; /* Hard limit (ceiling for rlim_cur) */
};

The soft limit is the value that the kernel enforces for the corresponding resource. The hard limit acts as a ceiling for the soft limit: an unprivileged process may only set its soft limit to a value in the range from 0 up to the hard limit, and (irreversibly) lower its hard limit. A privileged process (under Linux: one with the CAP_SYS_RESOURCE capability) may make arbitrary changes to either limit value.

The value RLIM_INFINITY denotes no limit on a resource (both in the structure returned by getrlimit() and in the structure passed to setrlimit()).
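The soft/hard distinction can be illustrated by lowering only the soft limit, which an unprivileged process is always allowed to do. This is a sketch; cap_soft_nofile() is a hypothetical name and RLIMIT_NOFILE is chosen only as an example resource.

```c
#include <sys/time.h>
#include <sys/resource.h>

/* Hypothetical helper: make sure the soft RLIMIT_NOFILE limit is at most want,
 * leaving the hard limit untouched. Returns 0 on success, -1 on failure. */
int cap_soft_nofile(rlim_t want)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) == -1)
        return -1;
    if (rl.rlim_cur != RLIM_INFINITY && rl.rlim_cur <= want)
        return 0;                       /* already within the requested cap */
    rl.rlim_cur = want;                 /* lowering: stays below rlim_max */
    return setrlimit(RLIMIT_NOFILE, &rl);
}
```

Because the hard limit is left alone, the same process can later raise its soft limit again, up to rlim_max.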

resource must be one of:

Tag

Description

RLIMIT_AS

The maximum size of the process’s virtual memory (address space) in bytes. This limit affects calls to brk(2), mmap(2) and mremap(2), which fail with the error ENOMEM upon exceeding this limit. Also automatic stack expansion will fail (and generate a SIGSEGV that kills the process if no alternate stack has been made available via sigaltstack(2)). Since the value is a long, on machines with a 32-bit long either this limit is at most 2 GiB, or this resource is unlimited.

RLIMIT_CORE

Maximum size of core file. When 0 no core dump files are created. When non-zero, larger dumps are truncated to this size.

RLIMIT_CPU

CPU time limit in seconds. When the process reaches the soft limit, it is sent a SIGXCPU signal. The default action for this signal is to terminate the process. However, the signal can be caught, and the handler can return control to the main program. If the process continues to consume CPU time, it will be sent SIGXCPU once per second until the hard limit is reached, at which time it is sent SIGKILL. (This latter point describes Linux 2.2 through 2.6 behaviour. Implementations vary in how they treat processes which continue to consume CPU time after reaching the soft limit. Portable applications that need to catch this signal should perform an orderly termination upon first receipt of SIGXCPU.)

RLIMIT_DATA

The maximum size of the process’s data segment (initialized data, uninitialized data, and heap). This limit affects calls to brk() and sbrk(), which fail with the error ENOMEM upon encountering the soft limit of this resource.

RLIMIT_FSIZE

The maximum size of files that the process may create. Attempts to extend a file beyond this limit result in delivery of a SIGXFSZ signal. By default, this signal terminates a process, but a process can catch this signal instead, in which case the relevant system call (e.g., write(), truncate()) fails with the error EFBIG.

RLIMIT_LOCKS (Early Linux 2.4 only)

A limit on the combined number of flock() locks and fcntl() leases that this process may establish.

RLIMIT_MEMLOCK

The maximum number of bytes of memory that may be locked into RAM. In effect this limit is rounded down to the nearest multiple of the system page size. This limit affects mlock(2) and mlockall(2) and the mmap(2) MAP_LOCKED operation. Since Linux 2.6.9 it also affects the shmctl(2) SHM_LOCK operation, where it sets a maximum on the total bytes in shared memory segments (see shmget(2)) that may be locked by the real user ID of the calling process. The shmctl(2) SHM_LOCK locks are accounted for separately from the per-process memory locks established by mlock(2), mlockall(2), and mmap(2) MAP_LOCKED; a process can lock bytes up to this limit in each of these two categories. In Linux kernels before 2.6.9, this limit controlled the amount of memory that could be locked by a privileged process. Since Linux 2.6.9, no limits are placed on the amount of memory that a privileged process may lock, and this limit instead governs the amount of memory that an unprivileged process may lock.

RLIMIT_MSGQUEUE (Since Linux 2.6.8)

Specifies the limit on the number of bytes that can be allocated for POSIX message queues for the real user ID of the calling process. This limit is enforced for mq_open(3). Each message queue that the user creates counts (until it is removed) against this limit according to the formula:

bytes = attr.mq_maxmsg * sizeof(struct msg_msg *) + attr.mq_maxmsg * attr.mq_msgsize

where attr is the mq_attr structure specified as the fourth argument to mq_open().

The first addend in the formula, which includes sizeof(struct msg_msg *) (4 bytes on Linux/x86), ensures that the user cannot create an unlimited number of zero-length messages (such messages nevertheless each consume some system memory for bookkeeping overhead).
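The accounting formula can be worked through with concrete numbers. In this sketch the pointer size is passed in as a parameter because struct msg_msg is a kernel-internal type that user programs cannot take sizeof; mq_bytes_charged() is a hypothetical name.

```c
#include <stddef.h>

/* Hypothetical helper: bytes charged against RLIMIT_MSGQUEUE for one queue,
 * following the formula quoted above. ptr_size stands in for
 * sizeof(struct msg_msg *), e.g. 4 on Linux/x86. */
size_t mq_bytes_charged(size_t maxmsg, size_t msgsize, size_t ptr_size)
{
    return maxmsg * ptr_size + maxmsg * msgsize;
}
```

With mq_maxmsg = 10, mq_msgsize = 8192 and 4-byte pointers, the queue is charged 10 * 4 + 10 * 8192 = 81960 bytes. With zero-length messages the first addend alone still charges 40 bytes.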

RLIMIT_NICE (since kernel 2.6.12, but see BUGS below)

Specifies a ceiling to which the process’s nice value can be raised using setpriority(2) or nice(2). The actual ceiling for the nice value is calculated as 20 - rlim_cur. (This strangeness occurs because negative numbers cannot be specified as resource limit values, since they typically have special meanings. For example, RLIM_INFINITY typically is the same as -1.)

RLIMIT_NOFILE

Specifies a value one greater than the maximum file descriptor number that can be opened by this process. Attempts (open(), pipe(), dup(), etc.) to exceed this limit yield the error EMFILE.

RLIMIT_NPROC

The maximum number of threads that can be created for the real user ID of the calling process. Upon encountering this limit, fork() fails with the error EAGAIN.

RLIMIT_RSS

Specifies the limit (in pages) of the process’s resident set (the number of virtual pages resident in RAM). This limit only has effect in Linux 2.4.x, x < 30, and there only affects calls to madvise() specifying MADV_WILLNEED.

RLIMIT_RTPRIO (Since Linux 2.6.12, but see BUGS)

Specifies a ceiling on the real-time priority that may be set for this process using sched_setscheduler(2) and sched_setparam(2).

RLIMIT_SIGPENDING (Since Linux 2.6.8)

Specifies the limit on the number of signals that may be queued for the real user ID of the calling process. Both standard and real-time signals are counted for the purpose of checking this limit. However, the limit is only enforced for sigqueue(2); it is always possible to use kill(2) to queue one instance of any of the signals that are not already queued to the process.

RLIMIT_STACK

The maximum size of the process stack, in bytes. Upon reaching this limit, a SIGSEGV signal is generated. To handle this signal, a process must employ an alternate signal stack (sigaltstack(2)).

RLIMIT_OFILE is the BSD name for RLIMIT_NOFILE.

Return Value

On success, zero is returned. On error, -1 is returned, and errno is set appropriately.

Errors

Tag	Description
EFAULT	rlim points outside the accessible address space.
EINVAL	resource is not valid; or, for setrlimit(): rlim->rlim_cur was greater than rlim->rlim_max.
EPERM	An unprivileged process tried to use setrlimit() to increase a soft or hard limit above the current hard limit; the CAP_SYS_RESOURCE capability is required to do this. Or, the process tried to use setrlimit() to increase the soft or hard RLIMIT_NOFILE limit above the current kernel maximum (NR_OPEN).

BUGS

In older Linux kernels, the SIGXCPU and SIGKILL signals delivered when a process encountered the soft and hard RLIMIT_CPU limits were delivered one (CPU) second later than they should have been. This was fixed in kernel 2.6.8.

In 2.6.x kernels before 2.6.17, a RLIMIT_CPU limit of 0 is wrongly treated as "no limit" (like RLIM_INFINITY). Since kernel 2.6.17, setting a limit of 0 does have an effect, but is actually treated as a limit of 1 second.

A kernel bug means that RLIMIT_RTPRIO does not work in kernel 2.6.12; the problem is fixed in kernel 2.6.13.

In kernel 2.6.12, there was an off-by-one mismatch between the priority ranges returned by getpriority(2) and RLIMIT_NICE. This had the effect that actual ceiling for the nice value was calculated as 19 - rlim_cur. This was fixed in kernel 2.6.13.

Kernels before 2.4.22 did not diagnose the error EINVAL for setrlimit() when rlim->rlim_cur was greater than rlim->rlim_max.

Notes

A child process created via fork(2) inherits its parent's resource limits. Resource limits are preserved across execve(2).

Conforming To

SVr4, 4.3BSD, POSIX.1-2001. RLIMIT_MEMLOCK and RLIMIT_NPROC derive from BSD and are not specified in POSIX.1-2001; they are present on the BSDs and Linux, but on few other implementations. RLIMIT_RSS derives from BSD and is not specified in POSIX.1-2001; it is nevertheless present on most implementations. RLIMIT_MSGQUEUE, RLIMIT_NICE, RLIMIT_RTPRIO, and RLIMIT_SIGPENDING are Linux specific.

See Also

get_robust_list() Function

get_robust_list, set_robust_list - get/set the list of robust futexes

Synopsis

#include <linux/futex.h>

#include <syscall.h>

long get_robust_list(int pid, struct robust_list_head **head_ptr, size_t *len_ptr);
long set_robust_list(struct robust_list_head *head, size_t len);

Description

The robust futex implementation needs to maintain per-thread lists of robust futexes which are unlocked when the thread exits. These lists are managed in user space; the kernel is only notified about the location of the head of the list.

get_robust_list returns the head of the robust futex list of the thread with TID defined by the pid argument. If pid is 0, the returned head belongs to the current thread. head_ptr is the pointer to the head of the list of robust futexes. The get_robust_list function stores the address of the head of the list here. len_ptr is the pointer to the length variable. get_robust_list stores sizeof(**head_ptr) here.

set_robust_list sets the head of the list of robust futexes owned by the current thread to head. len is the size of *head.

Return Value

The set_robust_list and get_robust_list functions return zero when the operation is successful, an error code otherwise.

Errors

The set_robust_list function fails with EINVAL if the len value does not match the size of structure struct robust_list_head expected by kernel.

The get_robust_list function fails with EPERM if the current process does not have permission to see the robust futex list of the thread with the TID pid, ESRCH if a thread with the TID pid does not exist, or EFAULT if the head of the robust futex list can’t be stored in the space specified by the head argument.

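Application Usage

A thread can have only one robust futex list; therefore applications that wish to use this functionality should use the robust mutexes provided by glibc.

The system call is only available for debugging purposes and is not needed for normal operations.

Neither system call is available to application programs as a function; they can be called using the syscall(3) function.

See Also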

futex (2)

getrusage() Function

getrusage - get resource usage

Synopsis

#include <sys/time.h>
#include <sys/resource.h>

int getrusage(int who, struct rusage *usage);

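Description

getrusage() returns the current resource usages, for a who of either RUSAGE_SELF or RUSAGE_CHILDREN. The former asks for resources used by the current process, the latter for resources used by those of its children that have terminated and have been waited for.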

struct rusage {
    struct timeval ru_utime; /* user time used */
    struct timeval ru_stime; /* system time used */
    long ru_maxrss;          /* maximum resident set size */
    long ru_ixrss;           /* integral shared memory size */
    long ru_idrss;           /* integral unshared data size */
    long ru_isrss;           /* integral unshared stack size */
    long ru_minflt;          /* page reclaims */
    long ru_majflt;          /* page faults */
    long ru_nswap;           /* swaps */
    long ru_inblock;         /* block input operations */
    long ru_oublock;         /* block output operations */
    long ru_msgsnd;          /* messages sent */
    long ru_msgrcv;          /* messages received */
    long ru_nsignals;        /* signals received */
    long ru_nvcsw;           /* voluntary context switches */
    long ru_nivcsw;          /* involuntary context switches */
};

Return Value

On success, zero is returned. On error, -1 is returned, and errno is set appropriately.

Errors

Tag	Description
EFAULT	usage points outside the accessible address space.
EINVAL	who is invalid.

Conforming To

SVr4, 4.3BSD. POSIX.1-2001 specifies getrusage(), but only specifies the fields ru_utime and ru_stime.

Notes

Including <sys/time.h> is not required these days, but increases portability. (Indeed, struct timeval is defined in <sys/time.h>.)

In Linux kernel versions before 2.6.9, if the disposition of SIGCHLD is set to SIG_IGN then the resource usages of child processes are automatically included in the value returned by RUSAGE_CHILDREN, although POSIX.1-2001 explicitly prohibits this. This non-conformance is rectified in Linux 2.6.9 and later.

The above struct was taken from 4.3BSD Reno. Not all fields are meaningful under Linux. In Linux 2.4 only the fields ru_utime, ru_stime, ru_minflt, and ru_majflt are maintained. Since Linux 2.6, ru_nvcsw and ru_nivcsw are also maintained.

See Also

getsid() Function

getsid - get session ID

Synopsis

#include <unistd.h>

pid_t getsid(pid_t pid);

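Description

getsid(0) returns the session ID of the calling process. getsid(p) returns the session ID of the process with process ID p. (The session ID of a process is the process group ID of the session leader.) On error, (pid_t) -1 will be returned, and errno is set appropriately.

Errors

Tag	Description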
EPERM	A process with process ID p exists, but it is not in the same session as the current process, and the implementation considers this an error.
ESRCH	No process with process ID p was found.

Conforming To

SVr4, POSIX.1-2001.

Notes

Linux does not return EPERM.

Linux has this system call since Linux 1.3.44. There is libc support since libc 5.2.19.

To get the prototype under glibc, define both _XOPEN_SOURCE and _XOPEN_SOURCE_EXTENDED, or use "#define _XOPEN_SOURCE n" for some integer n larger than or equal to 500.

See Also

getsockname() Function

getsockname - get socket name

Synopsis

#include <sys/socket.h>

int getsockname(int s, struct sockaddr *name, socklen_t *namelen);

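Description

getsockname() returns the current name for the specified socket. The namelen parameter should be initialized to indicate the amount of space pointed to by name. On return it contains the actual size of the name returned (in bytes).

Return Value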

On success, zero is returned. On error, -1 is returned, and errno is set appropriately.

Errors

Tag	Description
EBADF	The argument s is not a valid descriptor.
EFAULT	The name parameter points to memory not in a valid part of the process address space.
EINVAL	namelen is invalid (e.g., is negative).
ENOBUFS	Insufficient resources were available in the system to perform the operation.
ENOTSOCK	The argument s is a file, not a socket.

Conforming To

SVr4, 4.4BSD (the getsockname() function call appeared in 4.2BSD), POSIX.1-2001.

Notes

The third argument of getsockname() is in reality an ‘int *’ (and this is what 4.x BSD and libc4 and libc5 have). Some POSIX confusion resulted in the present socklen_t, also used by glibc. See also accept(2).

See Also

getsockopt() Function

getsockopt, setsockopt - get and set options on sockets

Synopsis

#include <sys/types.h>
#include <sys/socket.h>

int getsockopt(int s, int level, int optname, void *optval, socklen_t *optlen);

int setsockopt(int s, int level, int optname, const void *optval, socklen_t optlen);

Description

getsockopt() and setsockopt() manipulate the options associated with a socket. Options may exist at multiple protocol levels; they are always present at the uppermost socket level.

When manipulating socket options the level at which the option resides and the name of the option must be specified. To manipulate options at the socket level, level is specified as SOL_SOCKET. To manipulate options at any other level the protocol number of the appropriate protocol controlling the option is supplied. For example, to indicate that an option is to be interpreted by the TCP protocol, level should be set to the protocol number of TCP; see getprotoent(3).

The parameters optval and optlen are used to access option values for setsockopt(). For getsockopt() they identify a buffer in which the value for the requested option(s) are to be returned. For getsockopt(), optlen is a value-result parameter, initially containing the size of the buffer pointed to by optval, and modified on return to indicate the actual size of the value returned. If no option value is to be supplied or returned, optval may be NULL.

Optname and any specified options are passed uninterpreted to the appropriate protocol module for interpretation. The include file <sys/socket.h> contains definitions for socket level options, described below. Options at other protocol levels vary in format and name; consult the appropriate entries in section 4 of the manual.

Most socket-level options utilize an int parameter for optval. For setsockopt(), the parameter should be non-zero to enable a boolean option, or zero if the option is to be disabled.

For a description of the available socket options see socket(7) and the appropriate protocol man pages.

Return Value

On success, zero is returned. On error, -1 is returned, and errno is set appropriately.

Errors

Tag	Description
EBADF	The argument s is not a valid descriptor.
EFAULT	The address pointed to by optval is not in a valid part of the process address space. For getsockopt(), this error may also be returned if optlen is not in a valid part of the process address space.
EINVAL	optlen invalid in setsockopt().
ENOPROTOOPT	The option is unknown at the level indicated.
ENOTSOCK	The argument s is a file, not a socket.

Conforming To

SVr4, 4.4BSD (these system calls first appeared in 4.2BSD), POSIX.1-2001.

Notes

The optlen argument of getsockopt and setsockopt is in reality an int [*] (and this is what 4.x BSD and libc4 and libc5 have). Some POSIX confusion resulted in the present socklen_t, also used by glibc. See also accept(2).

BUGS

Several of the socket options should be handled at lower levels of the system.

See Also

get_thread_area() Function

get_thread_area - get a Thread Local Storage (TLS) area

Synopsis

#include <linux/unistd.h>
#include <asm/ldt.h>

int get_thread_area(struct user_desc *u_info);

Description

get_thread_area() returns an entry in the current thread’s Thread Local Storage (TLS) array. The index of the entry corresponds to the value of u_info->entry_number, passed in by the user. If the value is in bounds, get_thread_area() copies the corresponding TLS entry into the area pointed to by u_info.

Return Value

get_thread_area() returns 0 on success. Otherwise, it returns -1 and sets errno appropriately.

Errors

Tag	Description
EFAULT	u_info is an invalid pointer.
EINVAL	u_info->entry_number is out of bounds.

Conforming To

get_thread_area() is Linux specific and should not be used in programs that are intended to be portable.

AVAILABILITY

A version of get_thread_area() first appeared in Linux 2.5.32.

See Also

gettid() Function

gettid - get thread identification

Synopsis

#include <sys/types.h>

pid_t gettid(void);

Description

gettid() returns the thread ID of the current process. This is equal to the process ID (as returned by getpid(2)), unless the process is part of a thread group (created by specifying the CLONE_THREAD flag to the clone(2) system call). All processes in the same thread group have the same PID, but each one has a unique TID.

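Return Value

On success, returns the thread ID of the current process.

Errors

This call is always successful.

Conforming To

gettid() is Linux specific and should not be used in programs that are intended to be portable.

Notes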

Glibc does not provide a wrapper for this system call; call it using syscall(2).

See Also

gettimeofday() Function

gettimeofday, settimeofday - get / set time

Synopsis

#include <sys/time.h>

int gettimeofday(struct timeval *tv, struct timezone *tz);
int settimeofday(const struct timeval *tv, const struct timezone *tz);

Description

The functions gettimeofday() and settimeofday() can get and set the time as well as a timezone. The tv argument is a struct timeval (as specified in <sys/time.h>):

struct timeval {
    time_t tv_sec;       /* seconds */
    suseconds_t tv_usec; /* microseconds */
};

and gives the number of seconds and microseconds since the Epoch (see time(2)). The tz argument is a struct timezone:

struct timezone {
    int tz_minuteswest; /* minutes west of Greenwich */
    int tz_dsttime;     /* type of DST correction */
};

If either tv or tz is NULL, the corresponding structure is not set or returned.

The use of the timezone structure is obsolete; the tz argument should normally be specified as NULL. The tz_dsttime field has never been used under Linux; it has not been and will not be supported by libc or glibc. Each and every occurrence of this field in the kernel source (other than the declaration) is a bug. Thus, the following is purely of historic interest.

The field tz_dsttime contains a symbolic constant (values are given below) that indicates in which part of the year Daylight Saving Time is in force. (Note: its value is constant throughout the year: it does not indicate that DST is in force, it just selects an algorithm.) The daylight saving time algorithms defined are as follows:

DST_NONE /* not on dst */
DST_USA /* USA style dst */
DST_AUST /* Australian style dst */
DST_WET /* Western European dst */
DST_MET /* Middle European dst */
DST_EET /* Eastern European dst */
DST_CAN /* Canada */
DST_GB /* Great Britain and Eire */
DST_RUM /* Rumania */
DST_TUR /* Turkey */
DST_AUSTALT /* Australian style with shift in 1986 */

Of course it turned out that the period in which Daylight Saving Time is in force cannot be given by a simple algorithm, one per country; indeed, this period is determined by unpredictable political decisions. So this method of representing time zones has been abandoned. Under Linux, in a call to settimeofday() the tz_dsttime field should be zero.

Under Linux there are some peculiar ‘warp clock’ semantics associated with the settimeofday() system call if on the very first call (after booting) that has a non-NULL tz argument, the tv argument is NULL and the tz_minuteswest field is non-zero. In such a case it is assumed that the CMOS clock is on local time, and that it has to be incremented by this amount to get UTC system time. No doubt it is a bad idea to use this feature.

The following macros are defined to operate on a struct timeval:

#define timerisset(tvp)\
        ((tvp)->tv_sec || (tvp)->tv_usec)

#define timercmp(tvp, uvp, cmp)\
        ((tvp)->tv_sec cmp (uvp)->tv_sec ||\
        (tvp)->tv_sec == (uvp)->tv_sec &&\
        (tvp)->tv_usec cmp (uvp)->tv_usec)

#define timerclear(tvp)\
        ((tvp)->tv_sec = (tvp)->tv_usec = 0)

Return Value

gettimeofday() and settimeofday() return 0 for success, or -1 for failure (in which case errno is set appropriately).

Errors

Tag	Description
EFAULT	One of tv or tz pointed outside the accessible address space.
EINVAL	Timezone (or something else) is invalid.
EPERM	The calling process has insufficient privilege to call settimeofday(); under Linux the CAP_SYS_TIME capability is required.

Notes

The prototype for settimeofday() and the defines for timercmp, timerisset, timerclear, timeradd, timersub are (since glibc 2.2.2) only available if _BSD_SOURCE is defined.

Traditionally, the fields of struct timeval were longs.

Conforming To

SVr4, 4.3BSD. POSIX.1-2001 describes gettimeofday() but not settimeofday().

See Also

getuid() Function

getuid, geteuid - get user identity

Synopsis

#include <unistd.h>
#include <sys/types.h>

uid_t getuid(void);
uid_t geteuid(void);

Description

getuid() returns the real user ID of the current process.

geteuid() returns the effective user ID of the current process.

ERRORS

These functions are always successful.

CONFORMING TO

POSIX.1-2001, 4.3BSD.

HISTORY

In Unix V6 the getuid() call returned (euid << 8) + uid. Unix V7 introduced separate calls getuid() and geteuid().

getunwind() Function

getunwind - copy the unwind data to caller's buffer

Synopsis

#include <syscall.h>

#include <linux/unwind.h>

long getunwind (void *buf, size_t buf_size);

Description

The sys_getunwind function returns the size of the unwind table, which describes the gate page (kernel code that is mapped into user space).

The unwind data is copied to the buffer buf, which has size buf_size. The data is copied only if buf_size is greater than or equal to the size of the unwind data and buf is not NULL. The system call returns the size of the unwind data in both cases.

The first part of the unwind data contains an unwind table. The rest contains the associated unwind info in random order. The unwind table contains a table looking like:

        u64 start; (64-bit address of start of function)
        u64 end; (64-bit address of end of function)
        u64 info; (BUF-relative offset to unwind info)

An entry with a START address of zero is the end of table. For more information about the format you can see the IA-64 Software Conventions and Runtime Architecture.

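Return Value

The sys_getunwind function returns the size of the unwind table.

Errors

The sys_getunwind function fails with EFAULT if the unwind info can’t be stored in the space specified by the buf argument.

Availability

This system call is available only for the IA-64 architecture.

Application Usage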

This system call has been deprecated. It’s highly recommended to get at the kernel’s unwind info by the gate DSO. The address of the ELF header for this DSO is passed to user level via AT_SYSINFO_EHDR.

The system call is not available to application programs as a function; it can be called using the syscall(2) function.

See Also

syscall (2)

gtty() Function

afs_syscall, break, fattach, fdetach, ftime, getmsg, getpmsg, gtty, isastream, lock, mpx, multiplexer, prof, profil, putmsg, putpmsg, security, stty, ulimit, vserver - unimplemented system calls

Synopsis

Unimplemented system calls.

Description

These system calls are not implemented in the Linux 2.4 kernel.

Return Value

These system calls always return -1 and set errno to ENOSYS.

Notes

Note that ftime(3), profil(3) and ulimit(3) are implemented as library functions.

Some system calls, like alloc_hugepages(2), free_hugepages(2), ioperm(2), iopl(2), and vm86(2) only exist on certain architectures.

Some system calls, like ipc(2), create_module(2), init_module(2), and delete_module(2) only exist when the Linux kernel was built with support for them.

See Also

obsolete (2)

idle() Function

idle - make process 0 idle

Synopsis

#include <unistd.h>

int idle(void);

Description

idle() is an internal system call used during bootstrap. It marks the process’s pages as swappable, lowers its priority, and enters the main scheduling loop. idle() never returns.

Only process 0 may call idle(). Any user process, even a process with superuser permission, will receive EPERM.

Return Value

idle() never returns for process 0, and always returns -1 for a user process.

Errors

Tag	Description
EPERM	Always, for a user process.

Conforming To

This function is Linux-specific, and should not be used in programs intended to be portable.

Notes

Since 2.3.13 this system call does not exist anymore.

outb() Function

outb, outw, outl, outsb, outsw, outsl, inb, inw, inl, insb, insw, insl, outb_p, outw_p, outl_p, inb_p, inw_p, inl_p - port I/O

inotify_add_watch() Function

inotify_add_watch - add a watch to an initialized inotify instance

Synopsis

#include <sys/inotify.h>

int inotify_add_watch(int fd, const char *pathname, uint32_t mask);

Description

inotify_add_watch() adds a new watch, or modifies an existing watch, for the file whose location is specified in pathname; the caller must have read permission for this file. The fd argument is a file descriptor referring to the inotify instance whose watch list is to be modified. The events to be monitored for pathname are specified in the mask bit-mask argument. See inotify(7) for a description of the bits that can be set in mask.

A successful call to inotify_add_watch() returns the unique watch descriptor associated with pathname for this inotify instance. If pathname was not previously being watched by this inotify instance, then the watch descriptor is newly allocated. If pathname was already being watched, then the descriptor for the existing watch is returned.

The watch descriptor is returned by later read(2)s from the inotify file descriptor. These reads fetch inotify_event structures indicating file system events; the returned watch descriptor identifies the object for which the event occurred.

Return Value

On success, inotify_add_watch() returns a non-negative watch descriptor. On error -1 is returned and errno is set appropriately.

Errors

Tag	Description
EACCES	Read access to the given file is not permitted.
EBADF	The given file descriptor is not valid.
EFAULT	pathname points outside of the process’s accessible address space.
EINVAL	The given event mask contains no legal events; or fd is not an inotify file descriptor.
ENOMEM	Insufficient kernel memory was available.
ENOSPC	The user limit on the total number of inotify watches was reached or the kernel failed to allocate a needed resource.

History

Inotify was merged into the 2.6.13 Linux kernel.

Conforming To

This system call is Linux specific.

See Also

inotify_init() Function

inotify_init - initialize an inotify instance

Synopsis

#include <sys/inotify.h>

int inotify_init(void);

Description

inotify_init() initializes a new inotify instance and returns a file descriptor associated with a new inotify event queue.

Return Value

On success, inotify_init() returns a new file descriptor, or -1 if an error occurred (in which case, errno is set appropriately).

Errors

Tag	Description
EMFILE	The user limit on the total number of inotify instances has been reached.
ENFILE	The system limit on the total number of file descriptors has been reached.
ENOMEM	Insufficient kernel memory is available.

History

Inotify was merged into the 2.6.13 Linux kernel.

Conforming To

This system call is Linux specific.

See Also

inotify_rm_watch() Function

inotify_rm_watch - remove an existing watch from an inotify instance

Synopsis

#include <sys/inotify.h>

int inotify_rm_watch(int fd, uint32_t wd);

Description

inotify_rm_watch() removes the watch associated with the watch descriptor wd from the inotify instance associated with the file descriptor fd.

Removing a watch causes an IN_IGNORED event to be generated for this watch descriptor. (See inotify(7).)

Return Value

On success, inotify_rm_watch() returns zero, or -1 if an error occurred (in which case, errno is set appropriately).

Errors

Tag	Description
EBADF	fd is not a valid file descriptor.
EINVAL	The watch descriptor wd is not valid; or fd is not an inotify file descriptor.

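Section 2 of the manual describes the Linux system calls. A system call is an entry point into the Linux kernel. Usually, system calls are not invoked directly: instead, most system calls have corresponding C library wrapper functions which perform the steps required (e.g., trapping to kernel mode) in order to invoke the system call. Thus, making a system call looks the same as invoking a normal library function.

For a list of the Linux system calls, see syscalls(2).

Return Value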

On error, most system calls return a negative error number (i.e., the negated value of one of the constants described in errno(3)). The C library wrapper hides this detail from the caller: when a system call returns a negative value, the wrapper copies the absolute value into the errno variable, and returns -1 as the return value of the wrapper.

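The value returned by a successful system call depends on the call. Many system calls return 0 on success, but some can return non-zero values from a successful call. The details are described in the individual manual pages.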

In some cases, the programmer must define a feature test macro in order to obtain the declaration of a system call from the header file specified in the man page SYNOPSIS section. In such cases, the required macro is described in the man page. For further information on feature test macros, see feature_test_macros(7).

Conforming To

Certain terms and abbreviations are used to indicate Unix variants and standards to which calls in this section conform. See standards(7).

Notes

Calling Directly

In most cases, it is unnecessary to invoke a system call directly, but there are times when the Standard C library does not implement a nice wrapper function for you. In this case, the programmer must manually invoke the system call using syscall(2). Historically, this was also possible using one of the _syscall macros described in _syscall(2).

Authors and Copyright Conditions

Look at the header of the manual page source for the author(s) and copyright conditions. Note that these can be different from page to page!

See Also

This page is part of release 3.00 of the Linux man-pages project. A description of the project, and information about reporting bugs, can be found at http://www.kernel.org/doc/man-pages/.

inw() Function

outb, outw, outl, outsb, outsw, outsl, inb, inw, inl, insb, insw, insl, outb_p, outw_p, outl_p, inb_p, inw_p, inl_p - port I/O

Description

They are primarily designed for internal kernel use, but can be used from user space.

You compile with -O or -O2 or similar. The functions are defined as inline macros, and will not be substituted in without optimization enabled, causing unresolved references at link time.

CONFORMING TO

outb() and friends are hardware specific. The value argument is passed first and the port argument is passed second, which is the opposite order from most DOS implementations.

inw_p() Function

outb, outw, outl, outsb, outsw, outsl, inb, inw, inl, insb, insw, insl, outb_p, outw_p, outl_p, inb_p, inw_p, inl_p - port I/O

Description

They are primarily designed for internal kernel use, but can be used from user space.

You compile with -O or -O2 or similar. The functions are defined as inline macros, and will not be substituted in without optimization enabled, causing unresolved references at link time.

Conforming To

outb() and friends are hardware specific. The value argument is passed first and the port argument is passed second, which is the opposite order from most DOS implementations.

See Also

io_cancel() Function

io_cancel - cancel an outstanding asynchronous I/O operation

Synopsis

#include <libaio.h>

long io_cancel (aio_context_t ctx_id, struct iocb *iocb, struct io_event *result);

Description

io_cancel() attempts to cancel an asynchronous I/O operation previously submitted with the io_submit system call. ctx_id is the AIO context ID of the operation to be cancelled. If the AIO context is found, the event will be cancelled and then copied into the memory pointed to by result without being placed into the completion queue.

Return Value

io_cancel() returns 0 on success; otherwise, it returns one of the errors listed in the "Errors" section.

Errors

Tag	Description
EINVAL	The AIO context specified by ctx_id is invalid.
EFAULT	One of the data structures points to invalid data.
EAGAIN	The iocb specified was not cancelled.
ENOSYS	io_cancel() is not implemented on this architecture.

Versions

The asynchronous I/O system calls first appeared in Linux 2.5, August 2002.

Conforming To

io_cancel() is Linux specific and should not be used in programs that are intended to be portable.

See Also

io_setup(2), io_destroy(2), io_getevents(2), io_submit(2).

Notes

The asynchronous I/O system calls were written by Benjamin LaHaise.

Author

Kent Yoder.

ioctl() Function

ioctl - control device

Synopsis

#include <sys/ioctl.h>

int ioctl(int d, int request, ...);

Description

The ioctl() function manipulates the underlying device parameters of special files. In particular, many operating characteristics of character special files (e.g. terminals) may be controlled with ioctl() requests. The argument d must be an open file descriptor.

The second argument is a device-dependent request code. The third argument is an untyped pointer to memory. It’s traditionally char *argp (from the days before void * was valid C), and will be so named for this discussion.

An ioctl() request has encoded in it whether the argument is an in parameter or out parameter, and the size of the argument argp in bytes. Macros and defines used in specifying an ioctl() request are located in the file <sys/ioctl.h>.

Return Value

Usually, on success zero is returned. A few ioctl() requests use the return value as an output parameter and return a nonnegative value on success. On error, -1 is returned, and errno is set appropriately.

Errors

Tag	Description
EBADF	d is not a valid descriptor.
EFAULT	argp references an inaccessible memory area.
EINVAL	Request or argp is not valid.
ENOTTY	d is not associated with a character special device.
ENOTTY	The specified request does not apply to the kind of object that the descriptor d references.

Notes

In order to use this call, one needs an open file descriptor. Often the open(2) call has unwanted side effects, that can be avoided under Linux by giving it the O_NONBLOCK flag.

Conforming To

No single standard. Arguments, returns, and semantics of ioctl(2) vary according to the device driver in question (the call is used as a catch-all for operations that don’t cleanly fit the Unix stream I/O model). See ioctl_list(2) for a list of many of the known ioctl() calls. The ioctl() function call appeared in Version 7 AT&T Unix.

See Also

ioctl_list() Function

ioctl_list - list of ioctl calls in Linux/i386 kernel

Description

This is Ioctl List 1.3.27, a list of ioctl calls in Linux/i386 kernel 1.3.27. It contains 421 ioctls from /usr/include/{asm,linux}/*.h. For each ioctl, its numerical value, its name, and its argument type are given.

An argument type of ’const struct foo *’ means the argument is input to the kernel. ’struct foo *’ means the kernel outputs the argument. If the kernel uses the argument for both input and output, this is marked with // I-O.

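Some ioctls take more arguments or return more values than a single structure. These are marked // MORE and documented further in a separate section.

This list is very incomplete. Please e-mail changes and comments to <mec@duracef.shout.net>.

IOCTL Structure

Ioctl command values are 32-bit constants. In principle these constants are completely arbitrary, but people have tried to build some structure into them.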

The old Linux situation was that of mostly 16-bit constants, where the last byte is a serial number, and the preceding byte(s) give a type indicating the driver. Sometimes the major number was used: 0x03 for the HDIO_* ioctls, 0x06 for the LP* ioctls. And sometimes one or more ASCII letters were used. For example, TCGETS has value 0x00005401, with 0x54 = ’T’ indicating the terminal driver, and CYGETTIMEOUT has value 0x00435906, with 0x43 0x59 = ’C’ ’Y’ indicating the cyclades driver.

Later (0.98p5) some more information was built into the number. One has 2 direction bits (00: none, 01: write, 10: read, 11: read/write) followed by 14 size bits (giving the size of the argument), followed by an 8-bit type (collecting the ioctls in groups for a common purpose or a common driver), and an 8-bit serial number.

The macros describing this structure live in <asm/ioctl.h> and are _IO(type,nr) and {_IOR,_IOW,_IOWR}(type,nr,size). They use sizeof(size) so that size is a misnomer here: this third parameter is a data type.

Note that the size bits are very unreliable: in lots of cases they are wrong, either because of buggy macros using sizeof(sizeof(struct)), or because of legacy values.

Thus, it seems that the new structure only gave disadvantages: it does not help in checking, but it causes varying values for the various architectures.

Return Value

Decent ioctls return 0 on success and -1 on error, while any output value is stored via the argument. However, quite a few ioctls in fact return an output value. This is not yet indicated below.

// Main table.

//

0x00008901 FIOSETOWN const int *
0x00008902 SIOCSPGRP const int *
0x00008903 FIOGETOWN int *
0x00008904 SIOCGPGRP int *
0x00008905 SIOCATMARK int *
0x00008906 SIOCGSTAMP timeval *

//

0x00005401 TCGETS struct termios *
0x00005402 TCSETS const struct termios *
0x00005403 TCSETSW const struct termios *
0x00005404 TCSETSF const struct termios *
0x00005405 TCGETA struct termio *
0x00005406 TCSETA const struct termio *
0x00005407 TCSETAW const struct termio *
0x00005408 TCSETAF const struct termio *
0x00005409 TCSBRK int
0x0000540A TCXONC int
0x0000540B TCFLSH int
0x0000540C TIOCEXCL void
0x0000540D TIOCNXCL void
0x0000540E TIOCSCTTY int
0x0000540F TIOCGPGRP pid_t *
0x00005410 TIOCSPGRP const pid_t *
0x00005411 TIOCOUTQ int *
0x00005412 TIOCSTI const char *
0x00005413 TIOCGWINSZ struct winsize *
0x00005414 TIOCSWINSZ const struct winsize *
0x00005415 TIOCMGET int *
0x00005416 TIOCMBIS const int *
0x00005417 TIOCMBIC const int *
0x00005418 TIOCMSET const int *
0x00005419 TIOCGSOFTCAR int *
0x0000541A TIOCSSOFTCAR const int *
0x0000541B FIONREAD int *
0x0000541B TIOCINQ int *
0x0000541C TIOCLINUX const char * // MORE
0x0000541D TIOCCONS void
0x0000541E TIOCGSERIAL struct serial_struct *
0x0000541F TIOCSSERIAL const struct serial_struct *
0x00005420 TIOCPKT const int *
0x00005421 FIONBIO const int *
0x00005422 TIOCNOTTY void
0x00005423 TIOCSETD const int *
0x00005424 TIOCGETD int *
0x00005425 TCSBRKP int
0x00005426 TIOCTTYGSTRUCT struct tty_struct *
0x00005450 FIONCLEX void
0x00005451 FIOCLEX void
0x00005452 FIOASYNC const int *
0x00005453 TIOCSERCONFIG void
0x00005454 TIOCSERGWILD int *
0x00005455 TIOCSERSWILD const int *
0x00005456 TIOCGLCKTRMIOS struct termios *
0x00005457 TIOCSLCKTRMIOS const struct termios *
0x00005458 TIOCSERGSTRUCT struct async_struct *
0x00005459 TIOCSERGETLSR int *
0x0000545A TIOCSERGETMULTI struct serial_multiport_struct *
0x0000545B TIOCSERSETMULTI const struct serial_multiport_struct *

//

0x000089E0 SIOCAX25GETUID const struct sockaddr_ax25 *
0x000089E1 SIOCAX25ADDUID const struct sockaddr_ax25 *
0x000089E2 SIOCAX25DELUID const struct sockaddr_ax25 *
0x000089E3 SIOCAX25NOUID const int *
0x000089E4 SIOCAX25DIGCTL const int *
0x000089E5 SIOCAX25GETPARMS struct ax25_parms_struct * // I-O
0x000089E6 SIOCAX25SETPARMS const struct ax25_parms_struct *

//

0x00007314 STL_BINTR void
0x00007315 STL_BSTART void
0x00007316 STL_BSTOP void
0x00007317 STL_BRESET void

//

0x00005301 CDROMPAUSE void
0x00005302 CDROMRESUME void
0x00005303 CDROMPLAYMSF const struct cdrom_msf *
0x00005304 CDROMPLAYTRKIND const struct cdrom_ti *
0x00005305 CDROMREADTOCHDR struct cdrom_tochdr *
0x00005306 CDROMREADTOCENTRY struct cdrom_tocentry * // I-O
0x00005307 CDROMSTOP void
0x00005308 CDROMSTART void
0x00005309 CDROMEJECT void
0x0000530A CDROMVOLCTRL const struct cdrom_volctrl *
0x0000530B CDROMSUBCHNL struct cdrom_subchnl * // I-O
0x0000530C CDROMREADMODE2 const struct cdrom_msf * // MORE
0x0000530D CDROMREADMODE1 const struct cdrom_msf * // MORE
0x0000530E CDROMREADAUDIO const struct cdrom_read_audio * // MORE
0x0000530F CDROMEJECT_SW int
0x00005310 CDROMMULTISESSION struct cdrom_multisession * // I-O
0x00005311 CDROM_GET_UPC struct { char [8]; } *
0x00005312 CDROMRESET void
0x00005313 CDROMVOLREAD struct cdrom_volctrl *
0x00005314 CDROMREADRAW const struct cdrom_msf * // MORE
0x00005315 CDROMREADCOOKED const struct cdrom_msf * // MORE
0x00005316 CDROMSEEK const struct cdrom_msf *

//

0x00002000 CM206CTL_GET_STAT int
0x00002001 CM206CTL_GET_LAST_STAT int

//

0x00435901 CYGETMON struct cyclades_monitor *
0x00435902 CYGETTHRESH int *
0x00435903 CYSETTHRESH int
0x00435904 CYGETDEFTHRESH int *
0x00435905 CYSETDEFTHRESH int
0x00435906 CYGETTIMEOUT int *
0x00435907 CYSETTIMEOUT int
0x00435908 CYGETDEFTIMEOUT int *
0x00435909 CYSETDEFTIMEOUT int

//

0x80046601 EXT2_IOC_GETFLAGS int *
0x40046602 EXT2_IOC_SETFLAGS const int *
0x80047601 EXT2_IOC_GETVERSION int *
0x40047602 EXT2_IOC_SETVERSION const int *

//

0x00000000 FDCLRPRM void
0x00000001 FDSETPRM const struct floppy_struct *
0x00000002 FDDEFPRM const struct floppy_struct *
0x00000003 FDGETPRM struct floppy_struct *
0x00000004 FDMSGON void
0x00000005 FDMSGOFF void
0x00000006 FDFMTBEG void
0x00000007 FDFMTTRK const struct format_descr *
0x00000008 FDFMTEND void
0x0000000A FDSETEMSGTRESH int
0x0000000B FDFLUSH void
0x0000000C FDSETMAXERRS const struct floppy_max_errors *
0x0000000E FDGETMAXERRS struct floppy_max_errors *
0x00000010 FDGETDRVTYP struct { char [16]; } *
0x00000014 FDSETDRVPRM const struct floppy_drive_params *
0x00000015 FDGETDRVPRM struct floppy_drive_params *
0x00000016 FDGETDRVSTAT struct floppy_drive_struct *
0x00000017 FDPOLLDRVSTAT struct floppy_drive_struct *
0x00000018 FDRESET int
0x00000019 FDGETFDCSTAT struct floppy_fdc_state *
0x0000001B FDWERRORCLR void
0x0000001C FDWERRORGET struct floppy_write_errors *
0x0000001E FDRAWCMD struct floppy_raw_cmd * // MORE // I-O
0x00000028 FDTWADDLE void

//

0x0000125D BLKROSET const int *
0x0000125E BLKROGET int *
0x0000125F BLKRRPART void
0x00001260 BLKGETSIZE int *
0x00001261 BLKFLSBUF void
0x00001262 BLKRASET int
0x00001263 BLKRAGET int *
0x00000001 FIBMAP int * // I-O
0x00000002 FIGETBSZ int *

//

0x00000301 HDIO_GETGEO struct hd_geometry *
0x00000302 HDIO_GET_UNMASKINTR int *
0x00000304 HDIO_GET_MULTCOUNT int *
0x00000307 HDIO_GET_IDENTITY struct hd_driveid *
0x00000308 HDIO_GET_KEEPSETTINGS int *
0x00000309 HDIO_GET_CHIPSET int *
0x0000030A HDIO_GET_NOWERR int *
0x0000030B HDIO_GET_DMA int *
0x0000031F HDIO_DRIVE_CMD int * // I-O
0x00000321 HDIO_SET_MULTCOUNT int
0x00000322 HDIO_SET_UNMASKINTR int
0x00000323 HDIO_SET_KEEPSETTINGS int
0x00000324 HDIO_SET_CHIPSET int
0x00000325 HDIO_SET_NOWERR int
0x00000326 HDIO_SET_DMA int

//

0x000089F0 EQL_ENSLAVE struct ifreq * // MORE // I-O
0x000089F1 EQL_EMANCIPATE struct ifreq * // MORE // I-O
0x000089F2 EQL_GETSLAVECFG struct ifreq * // MORE // I-O
0x000089F3 EQL_SETSLAVECFG struct ifreq * // MORE // I-O
0x000089F4 EQL_GETMASTRCFG struct ifreq * // MORE // I-O
0x000089F5 EQL_SETMASTRCFG struct ifreq * // MORE // I-O

//

0x000089F0 SIOCDEVPLIP struct ifreq * // I-O

//

0x00005490 PPPIOCGFLAGS int *
0x00005491 PPPIOCSFLAGS const int *
0x00005492 PPPIOCGASYNCMAP int *
0x00005493 PPPIOCSASYNCMAP const int *
0x00005494 PPPIOCGUNIT int *
0x00005495 PPPIOCSINPSIG const int *
0x00005497 PPPIOCSDEBUG const int *
0x00005498 PPPIOCGDEBUG int *
0x00005499 PPPIOCGSTAT struct ppp_stats *
0x0000549A PPPIOCGTIME struct ppp_ddinfo *
0x0000549B PPPIOCGXASYNCMAP struct { int [8]; } *
0x0000549C PPPIOCSXASYNCMAP const struct { int [8]; } *
0x0000549D PPPIOCSMRU const int *
0x0000549E PPPIOCRASYNCMAP const int *
0x0000549F PPPIOCSMAXCID const int *

//

0x000089E0 SIOCAIPXITFCRT const char *
0x000089E1 SIOCAIPXPRISLT const char *
0x000089E2 SIOCIPXCFGDATA struct ipx_config_data *

//

0x00004B60 GIO_FONT struct { char [8192]; } *
0x00004B61 PIO_FONT const struct { char [8192]; } *
0x00004B6B GIO_FONTX struct console_font_desc * // MORE // I-O
0x00004B6C PIO_FONTX const struct console_font_desc * // MORE
0x00004B70 GIO_CMAP struct { char [48]; } *
0x00004B71 PIO_CMAP const struct { char [48]; } *
0x00004B2F KIOCSOUND int
0x00004B30 KDMKTONE int
0x00004B31 KDGETLED char *
0x00004B32 KDSETLED int
0x00004B33 KDGKBTYPE char *
0x00004B34 KDADDIO int // MORE
0x00004B35 KDDELIO int // MORE
0x00004B36 KDENABIO void // MORE
0x00004B37 KDDISABIO void // MORE
0x00004B3A KDSETMODE int
0x00004B3B KDGETMODE int *
0x00004B3C KDMAPDISP void // MORE
0x00004B3D KDUNMAPDISP void // MORE
0x00004B40 GIO_SCRNMAP struct { char [E_TABSZ]; } *
0x00004B41 PIO_SCRNMAP const struct { char [E_TABSZ]; } *
0x00004B69 GIO_UNISCRNMAP struct { short [E_TABSZ]; } *
0x00004B6A PIO_UNISCRNMAP const struct { short [E_TABSZ]; } *
0x00004B66 GIO_UNIMAP struct unimapdesc * // MORE // I-O
0x00004B67 PIO_UNIMAP const struct unimapdesc * // MORE
0x00004B68 PIO_UNIMAPCLR const struct unimapinit *
0x00004B44 KDGKBMODE int *
0x00004B45 KDSKBMODE int
0x00004B62 KDGKBMETA int *
0x00004B63 KDSKBMETA int
0x00004B64 KDGKBLED int *
0x00004B65 KDSKBLED int
0x00004B46 KDGKBENT struct kbentry * // I-O
0x00004B47 KDSKBENT const struct kbentry *
0x00004B48 KDGKBSENT struct kbsentry * // I-O
0x00004B49 KDSKBSENT const struct kbsentry *
0x00004B4A KDGKBDIACR struct kbdiacrs *
0x00004B4B KDSKBDIACR const struct kbdiacrs *
0x00004B4C KDGETKEYCODE struct kbkeycode * // I-O
0x00004B4D KDSETKEYCODE const struct kbkeycode *
0x00004B4E KDSIGACCEPT int

//

0x00000601 LPCHAR int
0x00000602 LPTIME int
0x00000604 LPABORT int
0x00000605 LPSETIRQ int
0x00000606 LPGETIRQ int *
0x00000608 LPWAIT int
0x00000609 LPCAREFUL int
0x0000060A LPABORTOPEN int
0x0000060B LPGETSTATUS int *
0x0000060C LPRESET void
0x0000060D LPGETSTATS struct lp_stats *

//

0x000089E0 SIOCGETVIFCNT struct sioc_vif_req * // I-O
0x000089E1 SIOCGETSGCNT struct sioc_sg_req * // I-O

//

0x40086D01 MTIOCTOP const struct mtop *
0x801C6D02 MTIOCGET struct mtget *
0x80046D03 MTIOCPOS struct mtpos *
0x80206D04 MTIOCGETCONFIG struct mtconfiginfo *
0x40206D05 MTIOCSETCONFIG const struct mtconfiginfo *

//

0x000089E0 SIOCNRGETPARMS struct nr_parms_struct * // I-O
0x000089E1 SIOCNRSETPARMS const struct nr_parms_struct *
0x000089E2 SIOCNRDECOBS void
0x000089E3 SIOCNRRTCTL const int *

//

0x00009000 DDIOCSDBG const int *
0x00005382 CDROMAUDIOBUFSIZ int

//

0x00005470 TIOCSCCINI void
0x00005471 TIOCCHANINI const struct scc_modem *
0x00005472 TIOCGKISS struct ioctl_command * // I-O
0x00005473 TIOCSKISS const struct ioctl_command *
0x00005474 TIOCSCCSTAT struct scc_stat *

//

0x00005382 SCSI_IOCTL_GET_IDLUN struct { int [2]; } *
0x00005383 SCSI_IOCTL_TAGGED_ENABLE void
0x00005384 SCSI_IOCTL_TAGGED_DISABLE void
0x00005385 SCSI_IOCTL_PROBE_HOST const int * // MORE

//

0x80027501 SMB_IOC_GETMOUNTUID uid_t *

//

0x0000890B SIOCADDRT const struct rtentry * // MORE
0x0000890C SIOCDELRT const struct rtentry * // MORE
0x00008910 SIOCGIFNAME char []
0x00008911 SIOCSIFLINK void
0x00008912 SIOCGIFCONF struct ifconf * // MORE // I-O
0x00008913 SIOCGIFFLAGS struct ifreq * // I-O
0x00008914 SIOCSIFFLAGS const struct ifreq *
0x00008915 SIOCGIFADDR struct ifreq * // I-O
0x00008916 SIOCSIFADDR const struct ifreq *
0x00008917 SIOCGIFDSTADDR struct ifreq * // I-O
0x00008918 SIOCSIFDSTADDR const struct ifreq *
0x00008919 SIOCGIFBRDADDR struct ifreq * // I-O
0x0000891A SIOCSIFBRDADDR const struct ifreq *
0x0000891B SIOCGIFNETMASK struct ifreq * // I-O
0x0000891C SIOCSIFNETMASK const struct ifreq *
0x0000891D SIOCGIFMETRIC struct ifreq * // I-O
0x0000891E SIOCSIFMETRIC const struct ifreq *
0x0000891F SIOCGIFMEM struct ifreq * // I-O
0x00008920 SIOCSIFMEM const struct ifreq *
0x00008921 SIOCGIFMTU struct ifreq * // I-O
0x00008922 SIOCSIFMTU const struct ifreq *
0x00008923 OLD_SIOCGIFHWADDR struct ifreq * // I-O
0x00008924 SIOCSIFHWADDR const struct ifreq * // MORE
0x00008925 SIOCGIFENCAP int *
0x00008926 SIOCSIFENCAP const int *
0x00008927 SIOCGIFHWADDR struct ifreq * // I-O
0x00008929 SIOCGIFSLAVE void
0x00008930 SIOCSIFSLAVE void
0x00008931 SIOCADDMULTI const struct ifreq *
0x00008932 SIOCDELMULTI const struct ifreq *
0x00008940 SIOCADDRTOLD void
0x00008941 SIOCDELRTOLD void
0x00008950 SIOCDARP const struct arpreq *
0x00008951 SIOCGARP struct arpreq * // I-O
0x00008952 SIOCSARP const struct arpreq *
0x00008960 SIOCDRARP const struct arpreq *
0x00008961 SIOCGRARP struct arpreq * // I-O
0x00008962 SIOCSRARP const struct arpreq *
0x00008970 SIOCGIFMAP struct ifreq * // I-O
0x00008971 SIOCSIFMAP const struct ifreq *

//

0x00005100 SNDCTL_SEQ_RESET void
0x00005101 SNDCTL_SEQ_SYNC void
0xC08C5102 SNDCTL_SYNTH_INFO struct synth_info * // I-O
0xC0045103 SNDCTL_SEQ_CTRLRATE int * // I-O
0x80045104 SNDCTL_SEQ_GETOUTCOUNT int *
0x80045105 SNDCTL_SEQ_GETINCOUNT int *
0x40045106 SNDCTL_SEQ_PERCMODE void
0x40285107 SNDCTL_FM_LOAD_INSTR const struct sbi_instrument *
0x40045108 SNDCTL_SEQ_TESTMIDI const int *
0x40045109 SNDCTL_SEQ_RESETSAMPLES const int *
0x8004510A SNDCTL_SEQ_NRSYNTHS int *
0x8004510B SNDCTL_SEQ_NRMIDIS int *
0xC074510C SNDCTL_MIDI_INFO struct midi_info * // I-O
0x4004510D SNDCTL_SEQ_THRESHOLD const int *
0xC004510E SNDCTL_SYNTH_MEMAVL int * // I-O
0x4004510F SNDCTL_FM_4OP_ENABLE const int *
0xCFB85110 SNDCTL_PMGR_ACCESS struct patmgr_info * // I-O
0x00005111 SNDCTL_SEQ_PANIC void
0x40085112 SNDCTL_SEQ_OUTOFBAND const struct seq_event_rec *
0xC0045401 SNDCTL_TMR_TIMEBASE int * // I-O
0x00005402 SNDCTL_TMR_START void
0x00005403 SNDCTL_TMR_STOP void
0x00005404 SNDCTL_TMR_CONTINUE void
0xC0045405 SNDCTL_TMR_TEMPO int * // I-O
0xC0045406 SNDCTL_TMR_SOURCE int * // I-O
0x40045407 SNDCTL_TMR_METRONOME const int *
0x40045408 SNDCTL_TMR_SELECT int * // I-O
0xCFB85001 SNDCTL_PMGR_IFACE struct patmgr_info * // I-O
0xC0046D00 SNDCTL_MIDI_PRETIME int * // I-O
0xC0046D01 SNDCTL_MIDI_MPUMODE const int *
0xC0216D02 SNDCTL_MIDI_MPUCMD struct mpu_command_rec * // I-O
0x00005000 SNDCTL_DSP_RESET void
0x00005001 SNDCTL_DSP_SYNC void
0xC0045002 SNDCTL_DSP_SPEED int * // I-O
0xC0045003 SNDCTL_DSP_STEREO int * // I-O
0xC0045004 SNDCTL_DSP_GETBLKSIZE int * // I-O
0xC0045006 SOUND_PCM_WRITE_CHANNELS int * // I-O
0xC0045007 SOUND_PCM_WRITE_FILTER int * // I-O
0x00005008 SNDCTL_DSP_POST void
0xC0045009 SNDCTL_DSP_SUBDIVIDE int * // I-O
0xC004500A SNDCTL_DSP_SETFRAGMENT int * // I-O
0x8004500B SNDCTL_DSP_GETFMTS int *
0xC0045005 SNDCTL_DSP_SETFMT int * // I-O
0x800C500C SNDCTL_DSP_GETOSPACE struct audio_buf_info *
0x800C500D SNDCTL_DSP_GETISPACE struct audio_buf_info *
0x0000500E SNDCTL_DSP_NONBLOCK void
0x80045002 SOUND_PCM_READ_RATE int *
0x80045006 SOUND_PCM_READ_CHANNELS int *
0x80045005 SOUND_PCM_READ_BITS int *
0x80045007 SOUND_PCM_READ_FILTER int *
0x00004300 SNDCTL_COPR_RESET void
0xCFB04301 SNDCTL_COPR_LOAD const struct copr_buffer *
0xC0144302 SNDCTL_COPR_RDATA struct copr_debug_buf * // I-O
0xC0144303 SNDCTL_COPR_RCODE struct copr_debug_buf * // I-O
0x40144304 SNDCTL_COPR_WDATA const struct copr_debug_buf *
0x40144305 SNDCTL_COPR_WCODE const struct copr_debug_buf *
0xC0144306 SNDCTL_COPR_RUN struct copr_debug_buf * // I-O
0xC0144307 SNDCTL_COPR_HALT struct copr_debug_buf * // I-O
0x4FA44308 SNDCTL_COPR_SENDMSG const struct copr_msg *
0x8FA44309 SNDCTL_COPR_RCVMSG struct copr_msg *
0x80044D00 SOUND_MIXER_READ_VOLUME int *
0x80044D01 SOUND_MIXER_READ_BASS int *
0x80044D02 SOUND_MIXER_READ_TREBLE int *
0x80044D03 SOUND_MIXER_READ_SYNTH int *
0x80044D04 SOUND_MIXER_READ_PCM int *
0x80044D05 SOUND_MIXER_READ_SPEAKER int *
0x80044D06 SOUND_MIXER_READ_LINE int *
0x80044D07 SOUND_MIXER_READ_MIC int *
0x80044D08 SOUND_MIXER_READ_CD int *
0x80044D09 SOUND_MIXER_READ_IMIX int *
0x80044D0A SOUND_MIXER_READ_ALTPCM int *
0x80044D0B SOUND_MIXER_READ_RECLEV int *
0x80044D0C SOUND_MIXER_READ_IGAIN int *
0x80044D0D SOUND_MIXER_READ_OGAIN int *
0x80044D0E SOUND_MIXER_READ_LINE1 int *
0x80044D0F SOUND_MIXER_READ_LINE2 int *
0x80044D10 SOUND_MIXER_READ_LINE3 int *
0x80044D1C SOUND_MIXER_READ_MUTE int *
0x80044D1D SOUND_MIXER_READ_ENHANCE int *
0x80044D1E SOUND_MIXER_READ_LOUD int *
0x80044DFF SOUND_MIXER_READ_RECSRC int *
0x80044DFE SOUND_MIXER_READ_DEVMASK int *
0x80044DFD SOUND_MIXER_READ_RECMASK int *
0x80044DFB SOUND_MIXER_READ_STEREODEVS int *
0x80044DFC SOUND_MIXER_READ_CAPS int *
0xC0044D00 SOUND_MIXER_WRITE_VOLUME int * // I-O
0xC0044D01 SOUND_MIXER_WRITE_BASS int * // I-O
0xC0044D02 SOUND_MIXER_WRITE_TREBLE int * // I-O
0xC0044D03 SOUND_MIXER_WRITE_SYNTH int * // I-O
0xC0044D04 SOUND_MIXER_WRITE_PCM int * // I-O
0xC0044D05 SOUND_MIXER_WRITE_SPEAKER int * // I-O
0xC0044D06 SOUND_MIXER_WRITE_LINE int * // I-O
0xC0044D07 SOUND_MIXER_WRITE_MIC int * // I-O
0xC0044D08 SOUND_MIXER_WRITE_CD int * // I-O
0xC0044D09 SOUND_MIXER_WRITE_IMIX int * // I-O
0xC0044D0A SOUND_MIXER_WRITE_ALTPCM int * // I-O
0xC0044D0B SOUND_MIXER_WRITE_RECLEV int * // I-O
0xC0044D0C SOUND_MIXER_WRITE_IGAIN int * // I-O
0xC0044D0D SOUND_MIXER_WRITE_OGAIN int * // I-O
0xC0044D0E SOUND_MIXER_WRITE_LINE1 int * // I-O
0xC0044D0F SOUND_MIXER_WRITE_LINE2 int * // I-O
0xC0044D10 SOUND_MIXER_WRITE_LINE3 int * // I-O
0xC0044D1C SOUND_MIXER_WRITE_MUTE int * // I-O
0xC0044D1D SOUND_MIXER_WRITE_ENHANCE int * // I-O
0xC0044D1E SOUND_MIXER_WRITE_LOUD int * // I-O
0xC0044DFF SOUND_MIXER_WRITE_RECSRC int * // I-O

//

0x000004D2 UMSDOS_READDIR_DOS struct umsdos_ioctl * // I-O
0x000004D3 UMSDOS_UNLINK_DOS const struct umsdos_ioctl *
0x000004D4 UMSDOS_RMDIR_DOS const struct umsdos_ioctl *
0x000004D5 UMSDOS_STAT_DOS struct umsdos_ioctl * // I-O
0x000004D6 UMSDOS_CREAT_EMD const struct umsdos_ioctl *
0x000004D7 UMSDOS_UNLINK_EMD const struct umsdos_ioctl *
0x000004D8 UMSDOS_READDIR_EMD struct umsdos_ioctl * // I-O
0x000004D9 UMSDOS_GETVERSION struct umsdos_ioctl *
0x000004DA UMSDOS_INIT_EMD void
0x000004DB UMSDOS_DOS_SETUP const struct umsdos_ioctl *
0x000004DC UMSDOS_RENAME_DOS const struct umsdos_ioctl *

//

0x00005600 VT_OPENQRY int *
0x00005601 VT_GETMODE struct vt_mode *
0x00005602 VT_SETMODE const struct vt_mode *
0x00005603 VT_GETSTATE struct vt_stat *
0x00005604 VT_SENDSIG void
0x00005605 VT_RELDISP int
0x00005606 VT_ACTIVATE int
0x00005607 VT_WAITACTIVE int
0x00005608 VT_DISALLOCATE int
0x00005609 VT_RESIZE const struct vt_sizes *
0x0000560A VT_RESIZEX const struct vt_consize *
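Some request numbers in the table, such as 0x80046601 and 0x40086D01, pack a direction, an argument size, a type byte and a sequence number into one value, while older ones such as 0x00005401 carry only the type and number. The sketch below decodes these fields. It assumes the generic Linux layout used on x86 and ARM (8 number bits, 8 type bits, 14 size bits, 2 direction bits); some architectures use a different split. The names ioctl_fields and decode_ioctl are made up for this example.

```c
#include <stdio.h>

/* Hypothetical container for the four fields of a request number. */
struct ioctl_fields {
    unsigned dir;   /* 0 = none, 1 = write (user to kernel), 2 = read */
    unsigned size;  /* size of the argument in bytes */
    unsigned type;  /* driver "magic" byte, often an ASCII letter */
    unsigned nr;    /* sequence number within the driver */
};

/* Split a request number, assuming the generic 8/8/14/2 bit layout. */
struct ioctl_fields decode_ioctl(unsigned long req)
{
    struct ioctl_fields f;

    f.nr   = (unsigned)(req & 0xFF);
    f.type = (unsigned)((req >> 8) & 0xFF);
    f.size = (unsigned)((req >> 16) & 0x3FFF);
    f.dir  = (unsigned)((req >> 30) & 0x3);
    return f;
}
```

Under that assumption, 0x80046601 (EXT2_IOC_GETFLAGS) decodes to direction 2 (read), size 4, type 0x66 ('f') and number 1, which matches its 'int *' output argument in the table.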

// More arguments.

Some ioctl’s take a pointer to a structure which contains additional
pointers. These are documented here in alphabetical order.

CDROMREADAUDIO takes an input pointer ’const struct cdrom_read_audio *’.
The ’buf’ field points to an output buffer
of length ’nframes * CD_FRAMESIZE_RAW’.

CDROMREADCOOKED, CDROMREADMODE1, CDROMREADMODE2, and CDROMREADRAW take
an input pointer ’const struct cdrom_msf *’. They use the same pointer
as an output pointer to ’char []’. The length varies by request. For
CDROMREADMODE1, most drivers use ’CD_FRAMESIZE’, but the Optics Storage
driver uses ’OPT_BLOCKSIZE’ instead (both have the numerical value
2048).

CDROMREADCOOKED char [CD_FRAMESIZE]
CDROMREADMODE1 char [CD_FRAMESIZE or OPT_BLOCKSIZE]
CDROMREADMODE2 char [CD_FRAMESIZE_RAW0]
CDROMREADRAW char [CD_FRAMESIZE_RAW]

EQL_ENSLAVE, EQL_EMANCIPATE, EQL_GETSLAVECFG, EQL_SETSLAVECFG,
EQL_GETMASTRCFG, and EQL_SETMASTRCFG take a ’struct ifreq *’.
The ’ifr_data’ field is a pointer to another structure as follows:

EQL_ENSLAVE const struct slaving_request *
EQL_EMANCIPATE const struct slaving_request *
EQL_GETSLAVECFG struct slave_config * // I-O
EQL_SETSLAVECFG const struct slave_config *
EQL_GETMASTRCFG struct master_config *
EQL_SETMASTRCFG const struct master_config *

FDRAWCMD takes a ’struct floppy_raw_cmd *’. If ’flags & FD_RAW_WRITE’
is non-zero, then ’data’ points to an input buffer of length ’length’.
If ’flags & FD_RAW_READ’ is non-zero, then ’data’ points to an output
buffer of length ’length’.

GIO_FONTX and PIO_FONTX take a ’struct console_font_desc *’ or
a ’const struct console_font_desc *’, respectively. ’chardata’ points to
a buffer of ’char [charcount]’. This is an output buffer for GIO_FONTX
and an input buffer for PIO_FONTX.

GIO_UNIMAP and PIO_UNIMAP take a ’struct unimapdesc *’ or
a ’const struct unimapdesc *’, respectively. ’entries’ points to a buffer
of ’struct unipair [entry_ct]’. This is an output buffer for GIO_UNIMAP
and an input buffer for PIO_UNIMAP.

KDADDIO, KDDELIO, KDDISABIO, and KDENABIO enable or disable access to
I/O ports. They are essentially alternate interfaces to ’ioperm’.

KDMAPDISP and KDUNMAPDISP enable or disable memory mappings or I/O port
access. They are not implemented in the kernel.

SCSI_IOCTL_PROBE_HOST takes an input pointer ’const int *’, which is a
length. It uses the same pointer as an output pointer to a ’char []’
buffer of this length.

SIOCADDRT and SIOCDELRT take an input pointer whose type depends on
the protocol:

Most protocols const struct rtentry *
AX.25 const struct ax25_route *
NET/ROM const struct nr_route_struct *

SIOCGIFCONF takes a ’struct ifconf *’. The ’ifc_buf’ field points to a
buffer of length ’ifc_len’ bytes, into which the kernel writes a list of
type ’struct ifreq []’.

SIOCSIFHWADDR takes an input pointer whose type depends on the protocol:

Most protocols const struct ifreq *
AX.25 const char [AX25_ADDR_LEN]

TIOCLINUX takes a ’const char *’. It uses this to distinguish several
independent sub-cases. In the table below, ’N + foo’ means ’foo’ after
an N-byte pad. ’struct selection’ is implicitly defined
in ’drivers/char/selection.c’.

TIOCLINUX-2 1 + const struct selection *
TIOCLINUX-3 void
TIOCLINUX-4 void
TIOCLINUX-5 4 + const struct { long [8]; } *
TIOCLINUX-6 char *
TIOCLINUX-7 char *
TIOCLINUX-10 1 + const char *

// Duplicate ioctls

This list does not include ioctls in the range SIOCDEVPRIVATE and
SIOCPROTOPRIVATE.

0x00000001 FDSETPRM FIBMAP
0x00000002 FDDEFPRM FIGETBSZ
0x00005382 CDROMAUDIOBUFSIZ SCSI_IOCTL_GET_IDLUN
0x00005402 SNDCTL_TMR_START TCSETS
0x00005403 SNDCTL_TMR_STOP TCSETSW
0x00005404 SNDCTL_TMR_CONTINUE TCSETSF

io_destroy() function

io_destroy - destroy an asynchronous I/O context

SYNOPSIS

#include <libaio.h>

int io_destroy (io_context_t ctx);

DESCRIPTION

io_destroy() removes the asynchronous I/O context from the list of I/O contexts and then destroys it. io_destroy() can also cancel any outstanding asynchronous I/O actions on ctx and block on completion.
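A context must exist before it can be destroyed, so the smallest useful illustration pairs io_setup() with io_destroy(). The sketch below deliberately bypasses libaio and invokes the raw system calls through syscall(2) with the kernel type aio_context_t from <linux/aio_abi.h>; that choice, and the helper name aio_context_roundtrip, are assumptions of this example rather than the interface shown in the synopsis.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/aio_abi.h>

/*
 * Hypothetical helper: create an AIO context able to hold nr_events
 * requests, then destroy it again.
 * Returns 0 on success or a negative errno value.
 */
int aio_context_roundtrip(unsigned nr_events)
{
    aio_context_t ctx = 0;      /* must be zero before io_setup() */

    if (syscall(SYS_io_setup, (long)nr_events, &ctx) == -1)
        return -errno;

    /* Cancels outstanding requests where possible, then frees ctx. */
    if (syscall(SYS_io_destroy, ctx) == -1)
        return -errno;

    return 0;
}
```

A call such as aio_context_roundtrip(8) normally returns 0. A negative result such as -ENOSYS indicates that the kernel or sandbox does not provide the AIO system calls.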

RETURN VALUE

io_destroy() returns 0 on success.

ERRORS

Tag	Description
EINVAL	The AIO context specified by ctx is invalid.
EFAULT	The context pointed to is invalid.
ENOSYS	io_destroy() is not implemented on this architecture.

CONFORMING TO

io_destroy() is Linux-specific and should not be used in programs that are intended to be portable.

VERSIONS

The asynchronous I/O system calls first appeared in Linux 2.5, August 2002.

SEE ALSO

io_setup(2), io_submit(2), io_getevents(2), io_cancel(2).

NOTES

The asynchronous I/O system calls were written by Benjamin LaHaise.

AUTHOR

Kent Yoder.

io_getevents() function

io_getevents - read asynchronous I/O events from the completion queue

SYNOPSIS

#include <linux/time.h>
#include <libaio.h>

long io_getevents (aio_context_t ctx_id, long min_nr, long nr, struct io_event *events, struct timespec *timeout);

DESCRIPTION

io_getevents() attempts to read at least min_nr events and up to nr events from the completion queue of the AIO context specified by ctx_id. timeout specifies the amount of time to wait for events, where a NULL timeout waits until at least min_nr events have been seen. Note that timeout is relative and will be updated if not NULL and the operation blocks.

RETURN VALUE

io_getevents() returns the number of events read: 0 if no events are available or <min_nr if the timeout has elapsed.
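The following sketch shows where io_getevents() fits in a complete request: one read is submitted and its completion is then collected with a relative timeout. It uses the raw system calls via syscall(2) and the structures from <linux/aio_abi.h> instead of libaio, and assumes a 64-bit system. The helper name aio_read_once and the five-second timeout are choices made for this example.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/aio_abi.h>

/*
 * Hypothetical helper: read up to len bytes from offset 0 of fd using a
 * single asynchronous request. Returns the result stored in the
 * completion event (bytes read) or a negative errno value.
 */
long aio_read_once(int fd, void *buf, size_t len)
{
    aio_context_t ctx = 0;
    struct iocb cb;
    struct iocb *list[1] = { &cb };
    struct io_event ev;
    struct timespec timeout = { 5, 0 };     /* relative: at most 5 s */
    long got, result;

    if (syscall(SYS_io_setup, 1L, &ctx) == -1)
        return -errno;

    memset(&cb, 0, sizeof cb);
    cb.aio_lio_opcode = IOCB_CMD_PREAD;
    cb.aio_fildes = (uint32_t)fd;
    cb.aio_buf = (uint64_t)(uintptr_t)buf;
    cb.aio_nbytes = len;
    cb.aio_offset = 0;

    if (syscall(SYS_io_submit, ctx, 1L, list) == -1) {
        result = -errno;
    } else {
        /* min_nr = 1, nr = 1: block until the one event arrives */
        got = syscall(SYS_io_getevents, ctx, 1L, 1L, &ev, &timeout);
        if (got == 1)
            result = (long)ev.res;
        else if (got == 0)
            result = -ETIMEDOUT;    /* fewer than min_nr events in time */
        else
            result = -errno;
    }

    syscall(SYS_io_destroy, ctx);
    return result;
}
```

For a regular file that begins with the five bytes "hello" and a buffer of eight bytes, aio_read_once() returns 5 when asynchronous I/O is available. The zero return of io_getevents() is mapped to -ETIMEDOUT here purely as a convention of the sketch.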

ERRORS

Tag	Description
EINVAL	ctx_id is invalid. min_nr is out of range or nr is out of range.
EFAULT	Either events or timeout is an invalid pointer.
ENOSYS	io_getevents() is not implemented on this architecture.

CONFORMING TO

io_getevents() is Linux-specific and should not be used in programs that are intended to be portable.

VERSIONS

The asynchronous I/O system calls first appeared in Linux 2.5, August 2002.

SEE ALSO

io_setup(2), io_submit(2), io_cancel(2), io_destroy(2).

NOTES

The asynchronous I/O system calls were written by Benjamin LaHaise.

AUTHOR

Kent Yoder.

ioperm() function

ioperm - set port input/output permissions

SYNOPSIS

#include <unistd.h> /* for libc5 */
#include <sys/io.h> /* for glibc */

int ioperm(unsigned long from, unsigned long num, int turn_on);

DESCRIPTION

ioperm() sets the port access permission bits for the process for num bytes starting from port address from to the value turn_on. The use of ioperm() requires root privileges.

Only the first 0x3ff I/O ports can be specified in this manner. For more ports, the iopl() function must be used. Permissions are not inherited on fork(), but on exec() they are. This is useful for giving port access permissions to non-privileged tasks.

This call is mostly for the i386 architecture. On many other architectures it does not exist or will always return an error.
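A typical use is to enable access to a single port, read it, and drop the permission again. The sketch below assumes glibc's <sys/io.h> on x86, which also supplies inb(). On other architectures it simply reports ENOSYS. The helper name read_port_byte is hypothetical.

```c
#include <errno.h>
#include <stdio.h>

#if defined(__i386__) || defined(__x86_64__)
#include <sys/io.h>             /* ioperm(), inb() on x86 glibc */
#define HAVE_PORT_IO 1
#endif

/*
 * Hypothetical helper: read one byte from an I/O port.
 * Returns the value (0..255) or a negative errno value, for example
 * -EPERM when the caller lacks CAP_SYS_RAWIO.
 */
int read_port_byte(unsigned short port)
{
#ifdef HAVE_PORT_IO
    int value;

    if (ioperm(port, 1, 1) == -1)       /* turn_on = 1: grant access */
        return -errno;
    value = inb(port);
    ioperm(port, 1, 0);                 /* turn_on = 0: revoke again */
    return value;
#else
    (void)port;
    return -ENOSYS;                     /* no port I/O on this arch */
#endif
}
```

Run without privileges, a call such as read_port_byte(0x378) is expected to fail with -EPERM. The port number 0x378 (a traditional parallel port address) is only an example.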

RETURN VALUE

On success, zero is returned. On error, -1 is returned, and errno is set appropriately.

ERRORS

Tag	Description
EINVAL	Invalid values for from or num.
EIO	(on ppc) This call is not supported.
EPERM	The calling process has insufficient privilege to call ioperm(); the CAP_SYS_RAWIO capability is required.

CONFORMING TO

ioperm() is Linux-specific and should not be used in programs that are intended to be portable.

NOTES

Libc5 treats it as a system call and has a prototype in <unistd.h>. Glibc1 does not have a prototype. Glibc2 has a prototype both in <sys/io.h> and in <sys/perm.h>. Avoid the latter; it is available on i386 only.

SEE ALSO

iopl(2)

iopl() function

iopl - change I/O privilege level

SYNOPSIS

#include <sys/io.h>

int iopl(int level);

DESCRIPTION

iopl() changes the I/O privilege level of the current process, as specified in level.

This call is necessary to allow 8514-compatible X servers to run under Linux. Since these X servers require access to all 65536 I/O ports, the ioperm() call is not sufficient.

In addition to granting unrestricted I/O port access, running at a higher I/O privilege level also allows the process to disable interrupts. This will probably crash the system, and is not recommended.

Permissions are inherited by fork() and exec().

The I/O privilege level for a normal process is 0.

This call is mostly for the i386 architecture. On many other architectures it does not exist or will always return an error.

RETURN VALUE

On success, zero is returned. On error, -1 is returned, and errno is set appropriately.

ERRORS

Tag	Description
EINVAL	level is greater than 3.
ENOSYS	This call is unimplemented.
EPERM	The calling process has insufficient privilege to call iopl(); the CAP_SYS_RAWIO capability is required.

CONFORMING TO

iopl() is Linux-specific and should not be used in programs that are intended to be portable.

SEE ALSO

ioperm(2)

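ioprio_get() function

ioprio_get, ioprio_set - get/set I/O scheduling class and priority

SYNOPSIS

int ioprio_get(int which, int who);

int ioprio_set(int which, int who, int ioprio);

DESCRIPTION

The ioprio_get() and ioprio_set() system calls respectively get and set the I/O scheduling class and priority of one or more processes.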

The which and who arguments identify the process(es) on which the system calls operate. The which argument determines how who is interpreted, and has one of the following values:

Tag	Description
IOPRIO_WHO_PROCESS
	who is a process ID identifying a single process.
IOPRIO_WHO_PGRP
	who is a process group ID identifying all the members of a process group.
IOPRIO_WHO_USER
	who is a user ID identifying all of the processes that have a matching real UID.

If which is specified as IOPRIO_WHO_PGRP or IOPRIO_WHO_USER when calling ioprio_get(), and more than one process matches who, then the returned priority will be the highest one found among all of the matching processes. One priority is said to be higher than another one if it belongs to a higher priority class (IOPRIO_CLASS_RT is the highest priority class; IOPRIO_CLASS_IDLE is the lowest) or if it belongs to the same priority class as the other process but has a higher priority level (a lower priority number means a higher priority level).

The ioprio argument given to ioprio_set() is a bit mask that specifies both the scheduling class and the priority to be assigned to the target process(es). The following macros are used for assembling and dissecting ioprio values:

IOPRIO_PRIO_VALUE(class, data)
	Given a scheduling class and priority (data), this macro combines the two values to produce an ioprio value, which is returned as the result of the macro.
IOPRIO_PRIO_CLASS(mask)
	Given mask (an ioprio value), this macro returns its I/O class component, that is, one of the values IOPRIO_CLASS_RT, IOPRIO_CLASS_BE, or IOPRIO_CLASS_IDLE.
IOPRIO_PRIO_DATA(mask)
	Given mask (an ioprio value), this macro returns its priority (data) component.
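Because glibc supplies no header for these macros, programs usually define them locally. The sketch below assumes the encoding used by the kernel's linux/ioprio.h, with the class stored above bit 13. The comparison helper ioprio_is_higher is hypothetical, follows the ordering rule stated above, and assumes both values carry an explicit class.

```c
#include <stdio.h>

/* Local copies of the kernel definitions (assumed 13-bit data field). */
#define IOPRIO_CLASS_SHIFT 13
#define IOPRIO_PRIO_MASK ((1UL << IOPRIO_CLASS_SHIFT) - 1)
#define IOPRIO_PRIO_CLASS(mask) ((mask) >> IOPRIO_CLASS_SHIFT)
#define IOPRIO_PRIO_DATA(mask) ((mask) & IOPRIO_PRIO_MASK)
#define IOPRIO_PRIO_VALUE(class, data) \
    (((class) << IOPRIO_CLASS_SHIFT) | (data))

enum {
    IOPRIO_CLASS_NONE,
    IOPRIO_CLASS_RT,
    IOPRIO_CLASS_BE,
    IOPRIO_CLASS_IDLE
};

/*
 * Hypothetical helper: return 1 if ioprio value a is a higher priority
 * than b. A lower class number (RT before BE before IDLE) wins; within
 * one class, a lower data value wins.
 */
int ioprio_is_higher(int a, int b)
{
    int class_a = IOPRIO_PRIO_CLASS(a);
    int class_b = IOPRIO_PRIO_CLASS(b);

    if (class_a != class_b)
        return class_a < class_b;
    return IOPRIO_PRIO_DATA(a) < IOPRIO_PRIO_DATA(b);
}
```

With these definitions, IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, 4) evaluates to 0x4004, and the class and data macros recover 2 and 4 from it.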

See the NOTES section for more information on scheduling classes and priorities.

I/O priorities are supported for reads and for synchronous (O_DIRECT, O_SYNC) writes. I/O priorities are not supported for asynchronous writes because they are issued outside the context of the program dirtying the memory, and thus program-specific priorities do not apply.

RETURN VALUE

On success, ioprio_get() returns the ioprio value of the process with highest I/O priority of any of the processes that match the criteria specified in which and who. On error, -1 is returned, and errno is set to indicate the error.

On success, ioprio_set() returns 0. On error, -1 is returned, and errno is set to indicate the error.

ERRORS

Tag	Description
EPERM	The calling process does not have the privilege needed to assign this ioprio to the specified process(es). See the NOTES section for more information on required privileges for ioprio_set().
ESRCH	No process(es) could be found that matched the specification in which and who.
EINVAL	Invalid value for which or ioprio. Refer to the NOTES section for available scheduler classes and priority levels for ioprio.

VERSIONS

These system calls have been available on Linux since kernel 2.6.13.

CONFORMING TO

These system calls are Linux-specific.

NOTES

Glibc does not provide a wrapper for these system calls; call them using syscall(2).
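Thin wrappers around syscall(2) might look like the sketch below. The constants are local copies of the kernel definitions (an assumption, since no libc header is used), and set_own_best_effort is a hypothetical helper that sets the calling process to a best-effort level and reads the value back.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Local copies of the kernel definitions (assumed 13-bit data field). */
#define IOPRIO_CLASS_SHIFT 13
#define IOPRIO_PRIO_VALUE(class, data) \
    (((class) << IOPRIO_CLASS_SHIFT) | (data))

enum { IOPRIO_CLASS_NONE, IOPRIO_CLASS_RT, IOPRIO_CLASS_BE,
       IOPRIO_CLASS_IDLE };
enum { IOPRIO_WHO_PROCESS = 1, IOPRIO_WHO_PGRP, IOPRIO_WHO_USER };

/* Wrappers with the prototypes shown in the synopsis. */
static int ioprio_get(int which, int who)
{
    return (int)syscall(SYS_ioprio_get, which, who);
}

static int ioprio_set(int which, int who, int ioprio)
{
    return (int)syscall(SYS_ioprio_set, which, who, ioprio);
}

/*
 * Hypothetical helper: give the calling process best-effort priority
 * `level` (0..7) and return the ioprio value the kernel then reports,
 * or a negative errno value.
 */
int set_own_best_effort(int level)
{
    int pid = (int)getpid();
    int want = IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, level);
    int got;

    if (ioprio_set(IOPRIO_WHO_PROCESS, pid, want) == -1)
        return -errno;
    got = ioprio_get(IOPRIO_WHO_PROCESS, pid);
    return got == -1 ? -errno : got;
}
```

Selecting the best-effort class for one's own process needs no special privilege, so set_own_best_effort(4) normally returns 0x4004 on a kernel that implements these calls.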

These system calls only have an effect when used in conjunction with an I/O scheduler that supports I/O priorities. As of kernel 2.6.17, the only such scheduler is the Completely Fair Queuing (CFQ) I/O scheduler.

Selecting an I/O Scheduler

I/O schedulers are selected on a per-device basis via the special file /sys/block/<device>/queue/scheduler.

One can view the current I/O scheduler via the /sys file system. For example, the following command displays a list of all schedulers currently loaded in the kernel:

$ cat /sys/block/hda/queue/scheduler
noop anticipatory deadline [cfq]

The scheduler surrounded by brackets is the one actually in use for the device (hda in the example). Setting another scheduler is done by writing the name of the new scheduler to this file. For example, the following command will set the scheduler for the hda device to cfq:

$ su
Password:
# echo cfq > /sys/block/hda/queue/scheduler
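A program can make the same check by reading the file and extracting the bracketed name. The sketch below is one way to do so. The function names active_scheduler and device_scheduler are hypothetical, and the 256-byte line buffer is an arbitrary choice.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: copy the scheduler name enclosed in [brackets]
 * from line into out. Returns 0 on success, or -1 if no bracketed name
 * is present or it does not fit in outsz bytes.
 */
int active_scheduler(const char *line, char *out, size_t outsz)
{
    const char *lb = strchr(line, '[');
    const char *rb = lb ? strchr(lb, ']') : NULL;
    size_t len;

    if (rb == NULL)
        return -1;
    len = (size_t)(rb - lb - 1);
    if (len + 1 > outsz)
        return -1;
    memcpy(out, lb + 1, len);
    out[len] = '\0';
    return 0;
}

/* Read /sys/block/<dev>/queue/scheduler and report the active entry. */
int device_scheduler(const char *dev, char *out, size_t outsz)
{
    char path[256];
    char line[256];
    char *ok;
    FILE *f;

    snprintf(path, sizeof path, "/sys/block/%s/queue/scheduler", dev);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    ok = fgets(line, sizeof line, f);
    fclose(f);
    return ok ? active_scheduler(line, out, outsz) : -1;
}
```

Given the sample output shown earlier, active_scheduler() stores "cfq" in out and returns 0.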

The Completely Fair Queuing (CFQ) I/O Scheduler

Since v3 (aka CFQ Time Sliced), CFQ implements I/O nice levels similar to those of CPU scheduling. These nice levels are grouped in three scheduling classes, each one containing one or more priority levels:

Tag	Description
IOPRIO_CLASS_RT (1)
	This is the real-time I/O class. This scheduling class is given higher priority than any other class: processes from this class are given first access to the disk every time. Thus this I/O class needs to be used with some care: one I/O real-time process can starve the entire system. Within the real-time class, there are 8 levels of class data (priority) that determine exactly how much time this process needs the disk for on each service. The highest real-time priority level is 0; the lowest is 7. In the future this might change to be more directly mappable to performance, by passing in a desired data rate instead.
IOPRIO_CLASS_BE (2)
	This is the best-effort scheduling class, which is the default for any process that hasn’t set a specific I/O priority. The class data (priority) determines how much I/O bandwidth the process will get. Best-effort priority levels are analogous to CPU nice values (see getpriority(2)). The priority level determines a priority relative to other processes in the best-effort scheduling class. Priority levels range from 0 (highest) to 7 (lowest).
IOPRIO_CLASS_IDLE (3)
	This is the idle scheduling class. Processes running at this level only get I/O time when no one else needs the disk. The idle class has no class data. Attention is required when assigning this priority class to a process, since it may become starved if higher priority processes are constantly accessing the disk.

Refer to Documentation/block/ioprio.txt for more information on the CFQ I/O Scheduler and an example program.

Required permissions to set I/O priorities

Permission to change a process's priority is granted or denied based on two criteria:

Tag	Description
Process ownership
	An unprivileged process may only set the I/O priority of a process whose real UID matches the real or effective UID of the calling process. A process which has the CAP_SYS_NICE capability can change the priority of any process.
What is the desired priority
	Attempts to set very high priorities (IOPRIO_CLASS_RT) or very low ones (IOPRIO_CLASS_IDLE) require the CAP_SYS_ADMIN capability.

A call to ioprio_set() must follow both rules, or the call will fail with the error EPERM.
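The two rules can be written down as a small decision function. The sketch below is a model of the text above, not kernel code; the structure caller and the function ioprio_set_allowed are hypothetical, and the capability flags are passed in as plain integers.

```c
#include <errno.h>
#include <sys/types.h>

enum { IOPRIO_CLASS_NONE, IOPRIO_CLASS_RT, IOPRIO_CLASS_BE,
       IOPRIO_CLASS_IDLE };

/* Hypothetical description of the process calling ioprio_set(). */
struct caller {
    uid_t ruid;             /* real UID */
    uid_t euid;             /* effective UID */
    int cap_sys_nice;       /* non-zero if CAP_SYS_NICE is held */
    int cap_sys_admin;      /* non-zero if CAP_SYS_ADMIN is held */
};

/*
 * Model of the two permission rules described above.
 * Returns 0 if the request would be allowed, EPERM otherwise.
 */
int ioprio_set_allowed(const struct caller *c, uid_t target_ruid,
                       int ioprio_class)
{
    /* Rule 1: process ownership, unless CAP_SYS_NICE is held. */
    if (!c->cap_sys_nice &&
        target_ruid != c->ruid && target_ruid != c->euid)
        return EPERM;

    /* Rule 2: the extreme classes need CAP_SYS_ADMIN. */
    if ((ioprio_class == IOPRIO_CLASS_RT ||
         ioprio_class == IOPRIO_CLASS_IDLE) && !c->cap_sys_admin)
        return EPERM;

    return 0;
}
```

An unprivileged caller with UID 1000 may therefore select IOPRIO_CLASS_BE for its own processes, but is refused for a process owned by UID 1001 and for the real-time class.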

BUGS

Glibc does not yet provide a suitable header file defining the function prototypes and macros described on this page. Suitable definitions can be found in linux/ioprio.h.

SEE ALSO

Documentation/block/ioprio.txt in the kernel source tree.

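ioprio_set() function

ioprio_get, ioprio_set - get/set I/O scheduling class and priority

SYNOPSIS

int ioprio_get(int which, int who);

int ioprio_set(int which, int who, int ioprio);

DESCRIPTION

The ioprio_get() and ioprio_set() system calls respectively get and set the I/O scheduling class and priority of one or more processes.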

The which and who arguments identify the process(es) on which the system calls operate. The which argument determines how who is interpreted, and has one of the following values:

Tag	Description
IOPRIO_WHO_PROCESS
	who is a process ID identifying a single process.
IOPRIO_WHO_PGRP
	who is a process group ID identifying all the members of a process group.
IOPRIO_WHO_USER
	who is a user ID identifying all of the processes that have a matching real UID.

If which is specified as IOPRIO_WHO_PGRP or IOPRIO_WHO_USER when calling ioprio_get(), and more than one process matches who, then the returned priority will be the highest one found among all of the matching processes. One priority is said to be higher than another one if it belongs to a higher priority class (IOPRIO_CLASS_RT is the highest priority class; IOPRIO_CLASS_IDLE is the lowest) or if it belongs to the same priority class as the other process but has a higher priority level (a lower priority number means a higher priority level).

The ioprio argument given to ioprio_set() is a bit mask that specifies both the scheduling class and the priority to be assigned to the target process(es). The following macros are used for assembling and dissecting ioprio values:

IOPRIO_PRIO_VALUE(class, data)
	Given a scheduling class and priority (data), this macro combines the two values to produce an ioprio value, which is returned as the result of the macro.
IOPRIO_PRIO_CLASS(mask)
	Given mask (an ioprio value), this macro returns its I/O class component, that is, one of the values IOPRIO_CLASS_RT, IOPRIO_CLASS_BE, or IOPRIO_CLASS_IDLE.
IOPRIO_PRIO_DATA(mask)
	Given mask (an ioprio value), this macro returns its priority (data) component.

See the NOTES section for more information on scheduling classes and priorities.

RETURN VALUE

On success, ioprio_get() returns the ioprio value of the process with the highest I/O priority among the processes that match which and who. On success, ioprio_set() returns 0. On error, -1 is returned, and errno is set to indicate the error.

ERRORS

Tag	Description
EPERM	The calling process does not have the privilege needed to assign this ioprio to the specified process(es). See the NOTES section for more information on required privileges for ioprio_set().
ESRCH	No process(es) could be found that matched the specification in which and who.
EINVAL	Invalid value for which or ioprio. Refer to the NOTES section for available scheduler classes and priority levels for ioprio.

VERSIONS

These system calls have been available on Linux since kernel 2.6.13.

CONFORMING TO

These system calls are Linux-specific.

NOTES

Glibc does not provide a wrapper for these system calls; call them using syscall(2).

Selecting an I/O Scheduler

I/O schedulers are selected on a per-device basis via the special file /sys/block/<device>/queue/scheduler.

One can view the current I/O scheduler via the /sys file system. For example, the following command displays a list of all schedulers currently loaded in the kernel:

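$ cat /sys/block/hda/queue/scheduler
noop anticipatory deadline [cfq]

The scheduler surrounded by brackets is the one actually in use for the device (hda in the example). Setting another scheduler is done by writing the name of the new scheduler to this file. For example, the following command sets the scheduler for the hda device to cfq:

$ su
Password:
# echo cfq > /sys/block/hda/queue/scheduler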

The Completely Fair Queuing (CFQ) I/O Scheduler

Tag	Description
IOPRIO_CLASS_RT (1)
	This is the real-time I/O class. This scheduling class is given higher priority than any other class: processes from this class are given first access to the disk every time. Thus this I/O class needs to be used with some care: one I/O real-time process can starve the entire system. Within the real-time class, there are 8 levels of class data (priority) that determine exactly how much time this process needs the disk for on each service. The highest real-time priority level is 0; the lowest is 7. In the future this might change to be more directly mappable to performance, by passing in a desired data rate instead.
IOPRIO_CLASS_BE (2)
	This is the best-effort scheduling class, which is the default for any process that hasn’t set a specific I/O priority. The class data (priority) determines how much I/O bandwidth the process will get. Best-effort priority levels are analogous to CPU nice values (see getpriority(2)). The priority level determines a priority relative to other processes in the best-effort scheduling class. Priority levels range from 0 (highest) to 7 (lowest).
IOPRIO_CLASS_IDLE (3)
	This is the idle scheduling class. Processes running at this level only get I/O time when no one else needs the disk. The idle class has no class data. Attention is required when assigning this priority class to a process, since it may become starved if higher priority processes are constantly accessing the disk.

Refer to Documentation/block/ioprio.txt for more information on the CFQ I/O Scheduler and an example program.

Required permissions to set I/O priorities

Permission to change a process's priority is granted or denied based on two assertions:

Tag	Description
Process ownership
	An unprivileged process may only set the I/O priority of a process whose real UID matches the real or effective UID of the calling process. A process which has the CAP_SYS_NICE capability can change the priority of any process.
What is the desired priority
	Attempts to set very high priorities (IOPRIO_CLASS_RT) or very low ones (IOPRIO_CLASS_IDLE) require the CAP_SYS_ADMIN capability.

A call to ioprio_set() must follow both rules, or the call will fail with the error EPERM.

BUGS

Glibc does not yet provide a suitable header file defining the function prototypes and macros described on this page. Suitable definitions can be found in linux/ioprio.h.

SEE ALSO

Documentation/block/ioprio.txt in the kernel source tree.

io_setup() function

io_setup - create an asynchronous I/O context

SYNOPSIS

#include <libaio.h>

int io_setup (int maxevents, io_context_t *ctxp);

DESCRIPTION

io_setup() creates an asynchronous I/O context capable of receiving at least maxevents events. ctxp must not point to an AIO context that already exists, and must be initialized to 0 prior to the call. On successful creation of the AIO context, *ctxp is filled in with the resulting handle.

RETURN VALUE

io_setup() returns 0 on success; otherwise, one of the errors listed in the "Errors" section is returned.

ERRORS

Tag	Description
EINVAL	ctxp is not initialized, or the specified maxevents exceeds internal limits. maxevents should be greater than 0.
EFAULT	An invalid pointer is passed for ctxp.
ENOMEM	Insufficient kernel resources are available.
EAGAIN	The specified maxevents exceeds the user’s limit of available events.
ENOSYS	io_setup() is not implemented on this architecture.

CONFORMING TO

io_setup() is Linux-specific and should not be used in programs that are intended to be portable.

VERSIONS

The asynchronous I/O system calls first appeared in Linux 2.5, August 2002.

SEE ALSO

io_destroy(2), io_getevents(2), io_submit(2), io_cancel(2).

NOTES

The asynchronous I/O system calls were written by Benjamin LaHaise.

AUTHOR

Kent Yoder.

io_submit() function

io_submit - submit asynchronous I/O blocks for processing

SYNOPSIS

#include <libaio.h>

long io_submit (aio_context_t ctx_id, long nr, struct iocb **iocbpp);

DESCRIPTION

io_submit() queues nr I/O request blocks for processing in the AIO context ctx_id. iocbpp should be an array of nr AIO request blocks, which will be submitted to context ctx_id.

RETURN VALUE

On success, io_submit() returns the number of iocbs submitted, which is 0 if nr is zero.

ERRORS

Tag	Description
EINVAL	The aio_context specified by ctx_id is invalid. nr is less than 0. The iocb at *iocbpp[0] is not properly initialized, or the operation specified is invalid for the file descriptor in the iocb.
EFAULT	One of the data structures points to invalid data.
EBADF	The file descriptor specified in the first iocb is invalid.
EAGAIN	Insufficient resources are available to queue any iocbs.
ENOSYS	io_submit() is not implemented on this architecture.

CONFORMING TO

io_submit() is Linux-specific and should not be used in programs that are intended to be portable.

VERSIONS

The asynchronous I/O system calls first appeared in Linux 2.5, August 2002.

SEE ALSO

io_setup(2), io_destroy(2), io_getevents(2), io_cancel(2).

NOTES

The asynchronous I/O system calls were written by Benjamin LaHaise.

AUTHOR

Kent Yoder.

ipc() function

ipc - System V IPC system calls

SYNOPSIS

int ipc(unsigned int call, int first, int second, int third, void *ptr, long fifth);

DESCRIPTION

ipc() is a common kernel entry point for the System V IPC calls for messages, semaphores, and shared memory. call determines which IPC function to invoke; the other arguments are passed through to the appropriate call.

User programs should call the appropriate functions by their usual names. Only standard library implementors and kernel hackers need to know about ipc().

CONFORMING TO

ipc() is Linux-specific and should not be used in programs intended to be portable.


isastream() function

afs_syscall, break, fattach, fdetach, ftime, getmsg, getpmsg, gtty, isastream, lock, mpx, multiplexer, prof, profil, putmsg, putpmsg, security, stty, ulimit, vserver - unimplemented system calls

SYNOPSIS

Unimplemented system calls.

DESCRIPTION

These system calls are not implemented in the Linux 2.4 kernel.

RETURN VALUE

These system calls always return -1 and set errno to ENOSYS.

NOTES

Note that ftime(3), profil(3) and ulimit(3) are implemented as library functions.

Some system calls, like alloc_hugepages(2), free_hugepages(2), ioperm(2), iopl(2), and vm86(2) only exist on certain architectures.

Some system calls, like ipc(2), create_module(2), init_module(2), and delete_module(2) only exist when the Linux kernel was built with support for them.

SEE ALSO

obsolete(2)

kexec_load() function

kexec_load - load a new kernel image into memory

SYNOPSIS

#include <syscall.h>

#include <kexec.h>

long kexec_load(unsigned long entry, unsigned long nr_segments,
struct kexec_segment *segments, unsigned long flags);

DESCRIPTION

kexec_load loads a new kernel from the current address space. This system call can only be used by root.

entry is a pointer to the entry point of the newly loaded executable image. It is the memory location to which the kernel will jump and start executing the instructions of the newly loaded image.

nr_segments denotes the number of segments which will be passed to kexec_load. The value must not be greater than KEXEC_SEGMENT_MAX.

segments denotes a pointer to the first element of an array of kexec_segment elements. A kexec_segment element contains the details of a segment to be loaded in memory.

flags: the sixteen most significant bits are used to communicate the architecture information (KEXEC_ARCH_*). The values for the various architectures are the same as those defined by the ELF specifications. The lower sixteen bits are reserved for miscellaneous information. Currently only one bit is used and the remaining fifteen are reserved for future use. The least significant bit (KEXEC_ON_CRASH) can be set to inform the kernel that the memory image being loaded is to be executed upon a system crash and not on a regular boot. For a regular boot, this bit is cleared.

RETURN VALUE

On success, zero is returned. On error, a nonzero value is returned, and errno is set appropriately.

ERRORS

EPERM The calling process does not have sufficient permissions (is not root).

EINVAL The flags argument contains an invalid combination of flags, or nr_segments is greater than KEXEC_SEGMENT_MAX.

ENOMEM There is not enough memory to store the kernel image.

EBUSY The memory location that should be written to is not available now.

AVAILABILITY

This system call has been available since kernel 2.6.13.

keyctl() function

keyctl - manipulate the kernel's key management facility

SYNOPSIS

#include <keyutils.h>

long keyctl(int cmd, ...);

DESCRIPTION

keyctl() has a number of functions available:

Tag	Description
KEYCTL_GET_KEYRING_ID
	Ask for a keyring’s ID.
KEYCTL_JOIN_SESSION_KEYRING
	Join or start named session keyring.
KEYCTL_UPDATE
	Update a key.
KEYCTL_REVOKE
	Revoke a key.
KEYCTL_CHOWN
	Set ownership of a key.
KEYCTL_SETPERM
	Set perms on a key.
KEYCTL_DESCRIBE
	Describe a key.
KEYCTL_CLEAR
	Clear contents of a keyring.
KEYCTL_LINK
	Link a key into a keyring.
KEYCTL_UNLINK
	Unlink a key from a keyring.
KEYCTL_SEARCH
	Search for a key in a keyring.
KEYCTL_READ
	Read a key or keyring’s contents.
KEYCTL_INSTANTIATE
	Instantiate a partially constructed key.
KEYCTL_NEGATE
	Negate a partially constructed key.
KEYCTL_SET_REQKEY_KEYRING
	Set default request-key keyring.
KEYCTL_SET_TIMEOUT
	Set timeout on a key.
KEYCTL_ASSUME_AUTHORITY
	Assume authority to instantiate key.

These are wrapped by libkeyutils into individual functions to permit the compiler to check types. See the See Also section at the bottom.

RETURN VALUE

On success keyctl() returns the serial number of the key it found. On error, the value -1 will be returned and errno will have been set to an appropriate error.

ERRORS

Tag	Description
ENOKEY	No matching key was found or an invalid key was specified.
EKEYEXPIRED
	An expired key was found or specified.
EKEYREVOKED
	A revoked key was found or specified.
EKEYREJECTED
	A rejected key was found or specified.
EDQUOT	The key quota for the caller’s user would be exceeded by creating a key or linking it to the keyring.
EACCES	A key operation wasn’t permitted.

LINKING

Although this is a Linux system call, it is not present in libc but can be found in libkeyutils instead. When linking, -lkeyutils should be specified to the linker.

SEE ALSO

keyctl(1)

add_key(2), request_key(2), keyctl_get_keyring_ID(3), keyctl_join_session_keyring(3), keyctl_update(3), keyctl_revoke(3), keyctl_chown(3), keyctl_setperm(3), keyctl_describe(3), keyctl_clear(3), keyctl_link(3), keyctl_unlink(3), keyctl_search(3), keyctl_read(3), keyctl_instantiate(3), keyctl_negate(3), keyctl_set_reqkey_keyring(3), keyctl_set_timeout(3), keyctl_assume_authority(3), keyctl_describe_alloc(3), keyctl_read_alloc(3), request-key(8)

kill() function

kill - send signal to a process

SYNOPSIS

#include <sys/types.h>

#include <signal.h>

int kill(pid_t pid, int sig);

DESCRIPTION

The kill() system call can be used to send any signal to any process group or process.

If pid is positive, then signal sig is sent to pid.

If pid equals 0, then sig is sent to every process in the process group of the current process.

If pid equals -1, then sig is sent to every process for which the calling process has permission to send signals, except for process 1 (init), but see below.

If pid is less than -1, then sig is sent to every process in the process group -pid.

If sig is 0, then no signal is sent, but error checking is still performed.
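The sig == 0 case is commonly used as an existence probe. A minimal sketch, assuming pid is positive (so that a single process is addressed); process_exists() is a hypothetical helper name, not part of any library:

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Probe a process with the null signal.
 * Returns 1 if pid exists, 0 if it does not, -1 on any other error. */
int process_exists(pid_t pid)
{
    if (kill(pid, 0) == 0)
        return 1;   /* exists, and we would be allowed to signal it */
    if (errno == EPERM)
        return 1;   /* exists, but we lack permission to signal it */
    if (errno == ESRCH)
        return 0;   /* no such process */
    return -1;
}
```

Note that the answer can be stale by the time the caller acts on it, and that a zombie still counts as an existing process.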

For a process to have permission to send a signal it must either be privileged (under Linux: have the CAP_KILL capability), or the real or effective user ID of the sending process must equal the real or saved set-user-ID of the target process. In the case of SIGCONT it suffices when the sending and receiving processes belong to the same session.

RETURN VALUE

On success (at least one signal was sent), zero is returned. On error, -1 is returned, and errno is set appropriately.

ERRORS

Tag	Description
EINVAL	An invalid signal was specified.
EPERM	The process does not have permission to send the signal to any of the target processes.
ESRCH	The pid or process group does not exist. Note that an existing process might be a zombie, a process which already committed termination, but has not yet been wait()ed for.

NOTES

The only signals that can be sent to task number one, the init process, are those for which init has explicitly installed signal handlers. This is done to assure the system is not brought down accidentally.

POSIX.1-2001 requires that kill(-1,sig) send sig to all processes that the current process may send signals to, except possibly for some implementation-defined system processes. Linux allows a process to signal itself, but on Linux the call kill(-1,sig) does not signal the current process.

POSIX.1-2001 requires that if a process sends a signal to itself, and the sending thread does not have the signal blocked, and no other thread has it unblocked or is waiting for it in sigwait(), at least one unblocked signal must be delivered to the sending thread before kill() returns.

BUGS

In 2.6 kernels up to and including 2.6.7, there was a bug that meant that when sending signals to a process group, kill() failed with the error EPERM if the caller did not have permission to send the signal to any (rather than all) of the members of the process group. Notwithstanding this error return, the signal was still delivered to all of the processes for which the caller had permission to signal.

LINUX HISTORY

Across different kernel versions, Linux has enforced different rules for the permissions required for an unprivileged process to send a signal to another process. In kernels 1.0 to 1.2.2, a signal could be sent if the effective user ID of the sender matched that of the receiver, or the real user ID of the sender matched that of the receiver. From kernel 1.2.3 until 1.3.77, a signal could be sent if the effective user ID of the sender matched either the real or effective user ID of the receiver. The current rules, which conform to POSIX.1-2001, were adopted in kernel 1.3.78.

CONFORMING TO

SVr4, 4.3BSD, POSIX.1-2001


killpg() function

killpg - send signal to a process group

SYNOPSIS

#include <signal.h>

int killpg(int pgrp, int sig);

DESCRIPTION

killpg() sends the signal sig to the process group pgrp. See signal(7) for a list of signals. If pgrp is 0, killpg() sends the signal to the sending process’s process group.

(POSIX says: If pgrp is less than or equal to 1, the behaviour is undefined.)

RETURN VALUE

On success, zero is returned. On error, -1 is returned, and errno is set appropriately.

ERRORS

Tag	Description
EINVAL	sig is not a valid signal number.
EPERM	The process does not have permission to send the signal to any of the target processes.
ESRCH	No process can be found in the process group specified by pgrp.
ESRCH	The process group was given as 0 but the sending process does not have a process group.

NOTES

There are various differences between the permission checking in BSD-type systems and System V-type systems. See the POSIX rationale for kill(). A difference not mentioned by POSIX concerns the return value EPERM: BSD documents that no signal is sent and EPERM returned when the permission check failed for at least one target process, while POSIX documents EPERM only when the permission check failed for all target processes.

CONFORMING TO

SVr4, 4.4BSD (The killpg() function call first appeared in 4BSD), POSIX.1-2001.


lchown() function

chown, fchown, lchown - change ownership of a file

SYNOPSIS

#include <sys/types.h>
#include <unistd.h>

int chown(const char *path, uid_t owner, gid_t group);
int fchown(int fd, uid_t owner, gid_t group);
int lchown(const char *path, uid_t owner, gid_t group);

DESCRIPTION

If the owner or group is specified as -1, then that ID is not changed.
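The -1 convention can be sketched as follows. This is an illustration only: change_group_only() and check_ids_unchanged() are hypothetical helper names, and the self-check assumes a writable /tmp directory.

```c
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Change only the group of path; (uid_t)-1 leaves the owner as it is. */
int change_group_only(const char *path, gid_t group)
{
    return chown(path, (uid_t)-1, group);
}

/* Self-check: create a scratch file, pass -1 for both IDs, and confirm
 * that neither the owner nor the group changed.
 * Returns 0 if both IDs are unchanged, -1 otherwise. */
int check_ids_unchanged(void)
{
    char path[] = "/tmp/chown_demo_XXXXXX";
    struct stat before, after;
    int fd = mkstemp(path);
    int ok;

    if (fd < 0)
        return -1;
    ok = stat(path, &before) == 0 &&
         change_group_only(path, (gid_t)-1) == 0 &&
         stat(path, &after) == 0 &&
         before.st_uid == after.st_uid &&
         before.st_gid == after.st_gid;
    close(fd);
    unlink(path);
    return ok ? 0 : -1;
}
```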

RETURN VALUE

On success, zero is returned. On error, -1 is returned, and errno is set appropriately.

ERRORS

Depending on the file system, other errors can be returned. The more general errors for chown() are listed below.

Tag	Description
EACCES	Search permission is denied on a component of the path prefix. (See also path_resolution(2).)
EFAULT	path points outside your accessible address space.
ELOOP	Too many symbolic links were encountered in resolving path.
ENAMETOOLONG
	path is too long.
ENOENT	The file does not exist.
ENOMEM	Insufficient kernel memory was available.
ENOTDIR
	A component of the path prefix is not a directory.
EPERM	The calling process did not have the required permissions (see above) to change owner and/or group.
EROFS	The named file resides on a read-only file system.
The general errors for fchown() are listed below:
EBADF	The descriptor is not valid.
EIO	A low-level I/O error occurred while modifying the inode.
ENOENT	See above.
EPERM	See above.
EROFS	See above.

NOTES

The prototype for fchown() is only available if _BSD_SOURCE is defined.

CONFORMING TO

4.4BSD, SVr4, POSIX.1-2001.

The 4.4BSD version can only be used by the superuser (that is, ordinary users cannot give away files).


linkat() function

linkat - create a file link relative to directory file descriptors

SYNOPSIS

#include <unistd.h>

int linkat(int olddirfd, const char *oldpath, int newdirfd, const char *newpath, int flags);

DESCRIPTION

The linkat() system call operates in exactly the same way as link(2), except for the differences described in this manual page.

If the pathname given in oldpath is relative, then it is interpreted relative to the directory referred to by the file descriptor olddirfd (rather than relative to the current working directory of the calling process, as is done by link(2) for a relative pathname).

If the pathname given in oldpath is relative and olddirfd is the special value AT_FDCWD, then oldpath is interpreted relative to the current working directory of the calling process (like link(2)).

If the pathname given in oldpath is absolute, then olddirfd is ignored.

The interpretation of newpath is as for oldpath, except that a relative pathname is interpreted relative to the directory referred to by the file descriptor newdirfd.

The flags argument is currently unused, and must be specified as 0.

RETURN VALUE

On success, linkat() returns 0. On error, -1 is returned and errno is set to indicate the error.

ERRORS

The same errors that occur for link(2) can also occur for linkat(). The following additional errors can occur for linkat():

Tag	Description
EBADF	olddirfd or newdirfd is not a valid file descriptor.
ENOTDIR
	oldpath is a relative path and olddirfd is a file descriptor referring to a file other than a directory; or similar for newpath and newdirfd.

NOTES

See openat(2) for an explanation of the need for linkat().

CONFORMING TO

This system call is non-standard but is proposed for inclusion in a future revision of POSIX.1.

VERSIONS

linkat() was added to Linux in kernel 2.6.16.


link() function

link - make a new name for a file

SYNOPSIS

#include <unistd.h>

int link(const char *oldpath, const char *newpath);

DESCRIPTION

link() creates a new link (also known as a hard link) to an existing file.

If newpath exists it will not be overwritten.

This new name may be used exactly as the old one for any operation; both names refer to the same file (and so have the same permissions and ownership) and it is impossible to tell which name was the "original".
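That both names refer to the same file can be checked by comparing device and inode numbers. A minimal sketch, in which same_file() and link_and_compare() are hypothetical helper names:

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if both names refer to the same file (same device and inode),
 * 0 if they do not, -1 if either stat() call fails. */
int same_file(const char *a, const char *b)
{
    struct stat sa, sb;

    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return -1;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}

/* Create path if it is missing, hard-link it as alias, then compare.
 * link() fails (EEXIST) rather than overwriting an existing alias. */
int link_and_compare(const char *path, const char *alias)
{
    int fd = open(path, O_CREAT | O_WRONLY, 0600);

    if (fd < 0)
        return -1;
    close(fd);
    if (link(path, alias) != 0)
        return -1;
    return same_file(path, alias);
}
```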

RETURN VALUE

On success, zero is returned. On error, -1 is returned, and errno is set appropriately.

ERRORS

Tag	Description
EACCES	Write access to the directory containing newpath is denied, or search permission is denied for one of the directories in the path prefix of oldpath or newpath. (See also path_resolution(2).)
EEXIST	newpath already exists.
EFAULT	oldpath or newpath points outside your accessible address space.
EIO	An I/O error occurred.
ELOOP	Too many symbolic links were encountered in resolving oldpath or newpath.
EMLINK	The file referred to by oldpath already has the maximum number of links to it.
ENAMETOOLONG
	oldpath or newpath was too long.
ENOENT	A directory component in oldpath or newpath does not exist or is a dangling symbolic link.
ENOMEM	Insufficient kernel memory was available.
ENOSPC	The device containing the file has no room for the new directory entry.
ENOTDIR
	A component used as a directory in oldpath or newpath is not, in fact, a directory.
EPERM	oldpath is a directory.
EPERM	The filesystem containing oldpath and newpath does not support the creation of hard links.
EROFS	The file is on a read-only filesystem.
EXDEV	oldpath and newpath are not on the same mounted filesystem. (Linux permits a filesystem to be mounted at multiple points, but link(2) does not work across different mount points, even if the same filesystem is mounted on both.)

NOTES

Hard links, as created by link(), cannot span filesystems. Use symlink() if this is required.

POSIX.1-2001 says that link() should dereference oldpath if it is a symbolic link. However, Linux does not do so: if oldpath is a symbolic link, then newpath is created as a (hard) link to the same symbolic link file (i.e., newpath becomes a symbolic link to the same file that oldpath refers to). Some other implementations behave in the same manner as Linux.

CONFORMING TO

SVr4, 4.3BSD, POSIX.1-2001 (except as noted above).

BUGS

On NFS file systems, the return code may be wrong in case the NFS server performs the link creation and dies before it can say so. Use stat(2) to find out if the link got created.


listen() function

listen - listen for connections on a socket

SYNOPSIS

#include <sys/socket.h>

int listen(int sockfd, int backlog);

DESCRIPTION

To accept connections, a socket is first created with socket(2), a willingness to accept incoming connections and a queue limit for incoming connections are specified with listen(), and then the connections are accepted with accept(2). The listen() call applies only to sockets of type SOCK_STREAM or SOCK_SEQPACKET.

The backlog parameter defines the maximum length the queue of pending connections may grow to. If a connection request arrives with the queue full the client may receive an error with an indication of ECONNREFUSED or, if the underlying protocol supports retransmission, the request may be ignored so that retries succeed.
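The socket(2), bind(2), listen() sequence described above can be sketched as follows. This is an illustration under stated assumptions: listen_on_loopback() is a hypothetical helper name, the socket is a TCP socket bound to the IPv4 loopback address, and a port of 0 asks the kernel to pick a free port.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a TCP socket bound to 127.0.0.1:port and mark it as listening
 * with the given backlog. Returns the descriptor, or -1 on error. */
int listen_on_loopback(unsigned short port, int backlog)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) != 0 ||
        listen(fd, backlog) != 0) {
        close(fd);
        return -1;
    }
    return fd;   /* ready to be passed to accept(2) */
}
```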

NOTES

The behaviour of the backlog parameter on TCP sockets changed with Linux 2.2. Now it specifies the queue length for completely established sockets waiting to be accepted, instead of the number of incomplete connection requests. The maximum length of the queue for incomplete sockets can be set using the tcp_max_syn_backlog sysctl. When syncookies are enabled there is no logical maximum length and this sysctl setting is ignored. See tcp(7) for more information.

RETURN VALUE

On success, zero is returned. On error, -1 is returned, and errno is set appropriately.

ERRORS

Tag	Description
EADDRINUSE
	Another socket is already listening on the same port.
EBADF	The argument sockfd is not a valid descriptor.
ENOTSOCK
	The argument sockfd is not a socket.
EOPNOTSUPP
	The socket is not of a type that supports the listen() operation.

CONFORMING TO

4.4BSD, POSIX.1-2001. The listen() function call first appeared in 4.2BSD.

BUGS

If the socket is of type AF_INET, and the backlog argument is greater than the constant SOMAXCONN (128 in Linux 2.0 & 2.2), it is silently truncated to SOMAXCONN.


_llseek() function

_llseek - reposition read/write file offset

SYNOPSIS

#include <sys/types.h>

#include <unistd.h>

int _llseek(unsigned int fd, unsigned long offset_high, unsigned long offset_low, loff_t *result, unsigned int whence);

DESCRIPTION

The _llseek() function repositions the offset of the open file associated with the file descriptor fd to (offset_high<<32) | offset_low bytes relative to the beginning of the file, the current position in the file, or the end of the file, depending on whether whence is SEEK_SET, SEEK_CUR, or SEEK_END, respectively.

It returns the resulting file position in the argument result.
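The (offset_high<<32) | offset_low composition can be sketched as plain arithmetic. The helper names split_offset() and join_offset() are hypothetical; the sketch only shows how a 64-bit offset maps onto the two arguments and does not invoke the system call itself.

```c
#include <stdint.h>

/* Split a 64-bit offset into the two 32-bit halves that would be passed
 * as offset_high and offset_low. */
void split_offset(uint64_t offset, unsigned long *high, unsigned long *low)
{
    *high = (unsigned long)(offset >> 32);
    *low = (unsigned long)(offset & 0xffffffffu);
}

/* Recombine the halves as the description states:
 * (offset_high << 32) | offset_low. */
uint64_t join_offset(unsigned long high, unsigned long low)
{
    return ((uint64_t)high << 32) | (uint64_t)(low & 0xffffffffu);
}
```

For example, the offset 0x123456789 splits into a high half of 0x1 and a low half of 0x23456789.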

RETURN VALUE

Upon successful completion, _llseek() returns 0. Otherwise, a value of -1 is returned and errno is set to indicate the error.

ERRORS

Tag	Description
EBADF	fd is not an open file descriptor.
EFAULT	Problem with copying results to user space.
EINVAL	whence is invalid.

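CONFORMING TO

This function is Linux-specific and should not be used in programs intended to be portable.

NOTES

Glibc does not provide a wrapper for this system call; call it using syscall(2).

SEE ALSO

lseek(2)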

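llseek() function

llseek - reposition read/write file offset

SYNOPSIS

#include <sys/types.h>

#include <unistd.h>

int _llseek(unsigned int fd, unsigned long offset_high, unsigned long offset_low, loff_t *result, unsigned int whence);

DESCRIPTION

The _llseek() function repositions the offset of the open file associated with the file descriptor fd to (offset_high<<32) | offset_low bytes relative to the beginning of the file, the current position in the file, or the end of the file, depending on whether whence is SEEK_SET, SEEK_CUR, or SEEK_END, respectively.

It returns the resulting file position in the argument result.

RETURN VALUE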

Upon successful completion, _llseek() returns 0. Otherwise, a value of -1 is returned and errno is set to indicate the error.

ERRORS

Tag	Description
EBADF	fd is not an open file descriptor.
EFAULT	Problem with copying results to user space.
EINVAL	whence is invalid.

CONFORMING TO

This function is Linux-specific and should not be used in programs intended to be portable.

NOTES

Glibc does not provide a wrapper for this system call; call it using syscall(2).

SEE ALSO

lseek(2)

lock() function

afs_syscall, break, fattach, fdetach, ftime, getmsg, getpmsg, gtty, isastream, lock, mpx, multiplexer, prof, profil, putmsg, putpmsg, security, stty, ulimit, vserver - unimplemented system calls

SYNOPSIS

Unimplemented system calls.

DESCRIPTION

These system calls are not implemented in the Linux 2.4 kernel.

RETURN VALUE

These system calls always return -1 and set errno to ENOSYS.

NOTES

Note that ftime(3), profil(3) and ulimit(3) are implemented as library functions.

Some system calls, like alloc_hugepages(2), free_hugepages(2), ioperm(2), iopl(2), and vm86(2) only exist on certain architectures.

Some system calls, like ipc(2), create_module(2), init_module(2), and delete_module(2) only exist when the Linux kernel was built with support for them.

SEE ALSO

obsolete(2)

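lookup_dcookie() function

lookup_dcookie - return a directory entry's path

SYNOPSIS

int lookup_dcookie(u64 cookie, char *buffer, size_t len);

DESCRIPTION

Look up the full path of the directory entry specified by the value cookie. The cookie is an opaque identifier uniquely identifying a particular directory entry. The buffer given is filled in with the full path of the directory entry.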

For lookup_dcookie() to return successfully, the kernel must still hold a cookie reference to the directory entry.

NOTES

lookup_dcookie() is a special-purpose system call, currently used only by the oprofile profiler. It relies on a kernel driver to register cookies for directory entries.

The path returned may be suffixed by the string " (deleted)" if the directory entry has been removed.

RETURN VALUE

On success, lookup_dcookie() returns the length of the path string copied into the buffer. On error, -1 is returned, and errno is set appropriately.

ERRORS

Tag	Description
EFAULT	The buffer was not valid.
EINVAL	The kernel has no registered cookie/directory entry mappings at the time of lookup, or the cookie does not refer to a valid directory entry.
ENAMETOOLONG
	The name could not fit in the buffer.
ENOMEM	The kernel could not allocate memory for the temporary buffer holding the path.
EPERM	The process does not have the capability CAP_SYS_ADMIN required to look up cookie values.
ERANGE	The buffer was not large enough to hold the path of the directory entry.

CONFORMING TO

lookup_dcookie() is Linux-specific.

AVAILABILITY

Since Linux 2.5.43. The ENAMETOOLONG error return was added in 2.5.70.

lseek() function

lseek - reposition read/write file offset

SYNOPSIS

#include <sys/types.h>
#include <unistd.h>

off_t lseek(int fildes, off_t offset, int whence);

DESCRIPTION

The lseek() function repositions the offset of the open file associated with the file descriptor fildes to the argument offset according to the directive whence as follows:

Tag	Description
SEEK_SET
	The offset is set to offset bytes.
SEEK_CUR
	The offset is set to its current location plus offset bytes.
SEEK_END
	The offset is set to the size of the file plus offset bytes.

The lseek() function allows the file offset to be set beyond the end of the file (but this does not change the size of the file). If data is later written at this point, subsequent reads of the data in the gap (a "hole") return null bytes ('\0') until data is actually written into the gap.
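The hole behaviour can be observed with a short experiment. A minimal sketch, assuming a writable /tmp directory; hole_reads_as_zeros() is a hypothetical helper name:

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Seek gap bytes past the start of an empty scratch file, write one byte
 * there, then read the gap back.
 * Returns 1 if every byte in the gap reads as '\0', 0 if not, -1 on error. */
int hole_reads_as_zeros(off_t gap)
{
    char path[] = "/tmp/lseek_hole_XXXXXX";
    char buf[256];
    off_t done = 0;
    int result = 1;
    int fd;

    if (gap < 0)
        return -1;
    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);   /* the file disappears once fd is closed */

    if (lseek(fd, gap, SEEK_SET) != gap ||   /* move beyond end of file */
        write(fd, "x", 1) != 1 ||            /* this write creates the hole */
        lseek(fd, 0, SEEK_SET) != 0) {       /* rewind to read the gap */
        close(fd);
        return -1;
    }
    while (done < gap) {
        size_t left = (size_t)(gap - done);
        size_t want = left < sizeof buf ? left : sizeof buf;
        ssize_t n = read(fd, buf, want);

        if (n <= 0) {
            result = -1;
            break;
        }
        for (ssize_t i = 0; i < n; i++)
            if (buf[i] != '\0')
                result = 0;
        done += n;
    }
    close(fd);
    return result;
}
```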

RETURN VALUE

Upon successful completion, lseek() returns the resulting offset location as measured in bytes from the beginning of the file. Otherwise, a value of (off_t)-1 is returned and errno is set to indicate the error.

ERRORS

Tag	Description
EBADF	fildes is not an open file descriptor.
EINVAL	whence is not one of SEEK_SET, SEEK_CUR, SEEK_END; or the resulting file offset would be negative, or beyond the end of a seekable device.
EOVERFLOW
	The resulting file offset cannot be represented in an off_t.
ESPIPE	fildes is associated with a pipe, socket, or FIFO.

CONFORMING TO

SVr4, 4.3BSD, POSIX.1-2001.

RESTRICTIONS

Some devices are incapable of seeking and POSIX does not specify which devices must support lseek().

Linux specific restrictions: using lseek() on a tty device returns ESPIPE.

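NOTES

This document's use of whence is incorrect English, but is maintained for historical reasons.

When converting old code, substitute values for whence with the following macros:

Old	New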
0	SEEK_SET
1	SEEK_CUR
2	SEEK_END
L_SET	SEEK_SET
L_INCR	SEEK_CUR
L_XTND	SEEK_END

SVr1-3 returns long instead of off_t, BSD returns int.

Note that file descriptors created by dup(2) or fork(2) share the current file position pointer, so seeking on such files may be subject to race conditions.


lstat() function

stat, fstat, lstat - get file status

SYNOPSIS

#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

int stat(const char *path, struct stat *buf);
int fstat(int filedes, struct stat *buf);
int lstat(const char *path, struct stat *buf);

DESCRIPTION

stat() stats the file pointed to by path and fills in buf.

lstat() is identical to stat(), except that if path is a symbolic link, then the link itself is stat-ed, not the file that it refers to.
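The difference between the two calls can be sketched with a pair of small helpers. The names is_symlink() and target_is_dir() are hypothetical; the first inspects the link itself with lstat(), the second follows it with stat().

```c
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if path itself is a symbolic link, 0 if not, -1 on error.
 * lstat() does not follow the link, so S_ISLNK() can see it. */
int is_symlink(const char *path)
{
    struct stat sb;

    if (lstat(path, &sb) != 0)
        return -1;
    return S_ISLNK(sb.st_mode) ? 1 : 0;
}

/* Returns 1 if path, after following any symbolic link, is a directory,
 * 0 if not, -1 on error. */
int target_is_dir(const char *path)
{
    struct stat sb;

    if (stat(path, &sb) != 0)
        return -1;
    return S_ISDIR(sb.st_mode) ? 1 : 0;
}
```

For a symbolic link that points at a directory, both helpers return 1: lstat() reports the link, while stat() reports the directory behind it.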