
IPFS - Content Addressed, Versioned, P2P File System
(DRAFT 3)
Juan Benet
juan@benet.ai
ABSTRACT
The InterPlanetary File System (IPFS) is a peer-to-peer distributed file system that seeks to connect all computing devices with the same system of files. In some ways, IPFS is similar to the Web, but IPFS could be seen as a single BitTorrent swarm, exchanging objects within one Git repository. In other words, IPFS provides a high-throughput content-addressed block storage model, with content-addressed hyperlinks. This forms a generalized Merkle DAG, a data structure upon which one can build versioned file systems, blockchains, and even a Permanent Web. IPFS combines a distributed hash table, an incentivized block exchange, and a self-certifying namespace. IPFS has no single point of failure, and nodes do not need to trust each other.
1. INTRODUCTION
There have been many attempts at constructing a global distributed file system. Some systems have seen significant success, and others failed completely. Among the academic attempts, AFS [6] has succeeded widely and is still in use today. Others [7, ?] have not attained the same success. Outside of academia, the most successful systems have been peer-to-peer file-sharing applications primarily geared toward large media (audio and video). Most notably, Napster, KaZaA, and BitTorrent [2] deployed large file distribution systems supporting over 100 million simultaneous users. Even today, BitTorrent maintains a massive deployment where tens of millions of nodes churn daily [16]. These applications saw greater numbers of users and files distributed than their academic file system counterparts. However, the applications were not designed as infrastructure to be built upon. While there have been successful repurposings (for example, Linux distributions use BitTorrent to transmit disk images, and Blizzard, Inc. uses it to distribute video game content), no general file system has emerged that offers global, low-latency, and decentralized distribution.
Perhaps this is because a “good enough” system for most use cases already exists: HTTP. By far, HTTP is the most successful “distributed system of files” ever deployed. Coupled with the browser, HTTP has had enormous technical and social impact. It has become the de facto way to transmit files across the internet. Yet, it fails to take advantage of dozens of brilliant file distribution techniques invented in the last fifteen years. From one perspective, evolving Web infrastructure is near-impossible, given the number of backwards-compatibility constraints and the number of strong parties invested in the current model. But from another perspective, new protocols have emerged and gained wide use since the emergence of HTTP. What is lacking is an upgrade path: a way to enhance the current HTTP web and introduce new functionality without degrading user experience.
Industry has gotten away with using HTTP this long because moving small files around is relatively cheap, even for small organizations with lots of traffic. But we are entering a new era of data distribution with new challenges: (a) hosting and distributing petabyte datasets, (b) computing on large data across organizations, (c) high-volume high-definition on-demand or real-time media streams, (d) versioning and linking of massive datasets, (e) preventing accidental disappearance of important files, and more. Many of these can be boiled down to “lots of data, accessible everywhere.” Pressed by critical features and bandwidth concerns, we have already given up HTTP for different data distribution protocols. The next step is making them part of the Web itself.
Orthogonal to efficient data distribution, version control systems have managed to develop important data collaboration workflows. Git, the distributed source code version control system, developed many useful ways to model and implement distributed data operations. The Git toolchain offers versatile versioning functionality that large file distribution systems severely lack. New solutions inspired by Git are emerging, such as Camlistore [?], a personal file storage system, and Dat [?], a data collaboration toolchain and dataset package manager. Git has already influenced distributed filesystem design [9], as its content-addressed Merkle DAG data model enables powerful file distribution strategies. What remains to be explored is how this data structure can influence the design of high-throughput file systems, and how it might upgrade the Web itself.
This paper introduces IPFS, a novel peer-to-peer version-controlled filesystem seeking to reconcile these issues. IPFS synthesizes learnings from many past successful systems. Careful interface-focused integration yields a system greater than the sum of its parts. The central IPFS principle is modeling all data as part of the same Merkle DAG.
2. BACKGROUND
This section reviews important properties of successful peer-to-peer systems, which IPFS combines.
2.1 Distributed Hash Tables
Distributed Hash Tables (DHTs) are widely used to coordinate and maintain metadata about peer-to-peer systems.
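As a rough sketch of the service a DHT provides, the hypothetical Go interface below (the names are illustrative, not any particular DHT's API) shows the operations peers cooperate to implement: a distributed key-value store in which each node is responsible for the keys closest to its own ID in the key space.

  package dht

  // PeerID identifies a node in the overlay; in most DHTs it is
  // drawn from the same space as keys (e.g. the hash of a public key).
  type PeerID [20]byte

  // DHT is a minimal view of what Kademlia-style DHTs provide.
  // (Hypothetical interface, for illustration.)
  type DHT interface {
      // PutValue stores a small value at the peers whose IDs are
      // closest to key under the overlay's distance metric.
      PutValue(key string, value []byte) error

      // GetValue retrieves a value by iteratively querying peers
      // ever closer to key.
      GetValue(key string) ([]byte, error)

      // FindProviders returns peers that have announced they can
      // serve the object named by key; this is how trackerless
      // BitTorrent locates swarm members, for example.
      FindProviders(key string) ([]PeerID, error)
  }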