A Gentle Introduction to Decentralized Storage

A blog post explaining decentralized storage, how it works, and popular DStorage protocols & platforms like IPFS, Swarm, Filecoin, etc.

Hey, fren! gm. ☀️

Recently, I worked on an NFT project that required IPFS for storing the NFT images and metadata. That sparked my interest in learning more about decentralized storage, which has fascinated me for a while now.

So, today we're gonna learn about decentralized storage (let's call it DStorage for short) and some of the most popular DStorage protocols in this blog post.

Let's dive in!



What Is DStorage?

As its name implies, DStorage is a storage system that does not rely on a central server or authority.

Unlike a centralized storage system, which is managed and operated by a single entity, DStorage is run by a peer-to-peer network of user-operated nodes. Each node stores a copy or a portion of the data, creating a resilient file storage and sharing system.

The decentralization makes it much more resistant to censorship and attacks than traditional storage systems.


Why Do You Need DStorage?

Today, cloud services like Google Drive, AWS, and Dropbox are the go-to option for hosting files and websites.

While these services have saved individuals, startups, and large companies from the hassle of managing their own storage infrastructure, the centralized nature of such services presents some deep flaws.

One of the many flaws of centralized storage services is that they are easy to censor. If a government or other authority does not want certain information to be out there, they can order the storage providers to remove it and not store it in the future.

Furthermore, the providers can even change the contents of the stored data, resulting in poor information integrity.

Another flaw is that centralized storage services are vulnerable to attack. If hackers can gain access to the servers of a centralized storage provider, they can potentially access and even delete all of the data you stored on those servers.

Since these services host data on centralized servers, they have a single point of failure: any outage can result in temporary or even permanent loss of data.


DStorage aims to solve these problems by providing censorship-resistant, secure, distributed, efficient, cost-effective, and resilient data storage.


How Does DStorage Work?

The DStorage model works by distributing a copy or a portion of the entire data across a peer-to-peer network of nodes, each of which is incentivized to store the data.

By storing data redundantly across multiple nodes, the DStorage system ensures data security and accessibility. When you store the same information across multiple nodes, you can still retrieve the data from the remaining storage nodes even if a few nodes go down.
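To make the redundancy idea concrete, here is a minimal sketch in Python. It is a toy in-memory model, not how any real DStorage network is implemented: the `Node` class, the `store_replicated` and `retrieve` helpers, and the replica count of 3 are all assumptions made up for illustration.

```python
import random


class Node:
    """A hypothetical storage node that keeps data in memory."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.store = {}


def store_replicated(nodes, key, data, replicas=3):
    """Store the same data on `replicas` randomly chosen nodes."""
    chosen = random.sample(nodes, replicas)
    for node in chosen:
        node.store[key] = data
    return chosen


def retrieve(nodes, key):
    """Return the data from the first online node that holds it."""
    for node in nodes:
        if node.online and key in node.store:
            return node.store[key]
    raise KeyError(f"{key!r} is not available on any online node")


nodes = [Node(f"node-{i}") for i in range(5)]
holders = store_replicated(nodes, "gm.txt", b"gm, fren!")

# Take two of the three holders offline; one copy is still reachable.
holders[0].online = False
holders[1].online = False
recovered = retrieve(nodes, "gm.txt")
print(recovered)
```

If the third holder also went offline, `retrieve` would raise a `KeyError`, which is why real networks tune how many copies (or pieces) they keep.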

Decentralized Cloud Storage Image Credit: LeewayHertz

What's in It for the Node Operators?

We all have unused space available in our computer and mobile device storage. The DStorage system uses available storage space on the node operator's disk drives, incentivizing node operators to rent out unused storage space on their devices to the DStorage networks.

That was a general overview of how DStorage works. Let's learn about popular DStorage protocols, platforms, and their inner workings.


DStorage Protocols and Platforms

We can consider the Ethereum network a DStorage system for smart contract storage. But storing large amounts of data on it, like images and videos, is infeasible, not to mention gas-inefficient, since that is not what Ethereum was designed for.

So, we need to look at other tailor-made solutions for storage purposes.

1. IPFS

IPFS is a peer-to-peer protocol for storing, accessing, and sharing data in a distributed file system.

What does that mumbo jumbo even mean? Let's take an example.

We all love Twitter (the little bird app is so cool). To access Twitter, you enter the https://twitter.com URL in your browser. That domain name resolves to the IP address of one of Twitter's servers, and you get whatever webpage the server at that address serves.

But if we put Twitter on IPFS, we don't get an IP address. Instead, we get a content identifier (CID) to access Twitter that looks something like this:

/ipfs/QmfExSLtVQwsFJNcN6AaW8DZsrL9CYsbHmxVdeLWkRzuyj

How Does IPFS Work?

There's a problem on the Web2 Internet: You find content by its location.

You want to watch some Netflix? Cool, go to https://netflix.com.

You want to read the Developer DAO blog? Visit https://blog.developerdao.com.

But what if the location of the content changes for some reason? 🤔

Content addressing fixes this problem. With content addressing, every piece of content has a CID, derived from a cryptographic hash of the data, that points to that data in IPFS.

That means two identical files will have the same CID, while even a slight difference in the content will generate a completely different CID. IPFS uses the SHA-256 hashing algorithm by default to generate CIDs.

IPFS uses content addressing to identify and find content rather than looking at where it's located.
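Here's a small Python sketch of the idea behind content addressing. Heads up: this is not how real IPFS CIDs are built (actual CIDs wrap the hash in extra metadata and encoding, and large files are chunked first). The `content_id` function and `ContentStore` class are hypothetical names used only to show that the address comes from the content itself.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Derive an identifier from the content itself (plain SHA-256 hex digest)."""
    return hashlib.sha256(data).hexdigest()


class ContentStore:
    """A toy store where data is looked up by what it is, not where it is."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        cid = content_id(data)
        self._blocks[cid] = data
        return cid

    def get(self, cid: str) -> bytes:
        data = self._blocks[cid]
        # Anyone can re-hash the data to check it matches the identifier.
        if content_id(data) != cid:
            raise ValueError("content does not match its identifier")
        return data


store = ContentStore()
cid_a = store.put(b"gm, fren!")
cid_b = store.put(b"gm, fren!")  # identical bytes
cid_c = store.put(b"gm, fren?")  # one character changed

print(cid_a == cid_b)  # True: same content, same identifier
print(cid_a == cid_c)  # False: different content, different identifier
```

Because the identifier is a hash of the data, whoever fetches the content can verify that nobody tampered with it, no matter which node served it.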

How IPFS Works | Image Credit: Infura Blog

There's so much to these DStorage protocols that each deserves a detailed blog post. For example, IPFS uses Merkle DAGs for representing files and directories. We'll stick to just the basics in this blog post for now.


2. Filecoin

The same team behind IPFS created Filecoin. It's a peer-to-peer network built on IPFS that incentivizes users to rent out unused storage space by rewarding them with Filecoin's native FIL token.

In Filecoin, users pay for storage space, and anyone who wants to store other users' files can join the Filecoin network and get paid.

How Filecoin Works | Image Credit: Filecoin


3. Swarm

Swarm is another DStorage protocol that is a part of Ethereum's holy trinity:

  • Ethereum for computing power
  • Whisper for messaging
  • Swarm for storage.

Swarm provides an entirely decentralized storage infrastructure that allows people from all over the world to become storage providers and get paid.

Its creators designed Swarm to be highly scalable and resilient and to provide a platform for applications requiring high security and censorship resistance.

The idea for Swarm was presented by Gavin Wood, and its development is mainly funded by the Ethereum Foundation.

The Need for Swarm | Image Credit: Swarm Docs


Difference Between IPFS & Swarm

While both protocols may look similar from a high level, subtle differences exist when we dive deeper into the inner workings and philosophies behind these protocols.

Some of them are:

  • Swarm's core storage component uses an immutable content-addressed chunk store, while IPFS uses distributed hash tables to find which peers are hosting the content.

💡 A hash table is a database of keys to values. A distributed hash table is one where the table is split across all the peers in a distributed network.

  • Swarm has deep integration with Ethereum for the incentive system, whereas IPFS has no incentive system and utilizes Filecoin to add an incentivization layer.
  • From a development standpoint, IPFS is much further along in code maturity, adoption, and community engagement than Swarm. As a result, IPFS has a lot of content available in terms of documentation. In contrast, Swarm doesn't have a large documentation base (heck, even the info I found while researching Swarm was from third-party blogs 😞).
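To picture the distributed hash table mentioned in the list above, here is a toy Python sketch. It is not IPFS's actual DHT implementation: the `Peer` class, the `dht_put` and `dht_get` helpers, and the rule of "store each key on the peer whose ID is closest by XOR distance" are simplifying assumptions for illustration, and `QmExampleCid` is just a made-up key.

```python
import hashlib


def to_id(text: str) -> int:
    """Hash a string to a 256-bit integer ID."""
    return int.from_bytes(hashlib.sha256(text.encode()).digest(), "big")


class Peer:
    """A hypothetical peer holding only its own slice of the table."""

    def __init__(self, name):
        self.name = name
        self.peer_id = to_id(name)
        self.table = {}


def responsible_peer(peers, key):
    """Pick the peer whose ID is closest to the key's ID by XOR distance."""
    key_id = to_id(key)
    return min(peers, key=lambda peer: peer.peer_id ^ key_id)


def dht_put(peers, key, value):
    responsible_peer(peers, key).table[key] = value


def dht_get(peers, key):
    return responsible_peer(peers, key).table.get(key)


peers = [Peer(f"peer-{i}") for i in range(4)]

# Record which (hypothetical) peers are hosting a piece of content.
dht_put(peers, "QmExampleCid", ["peer-1", "peer-3"])
providers = dht_get(peers, "QmExampleCid")
print(providers)
```

No single peer holds the whole table here. Each lookup hashes the key and goes straight to the peer responsible for that part of the key space.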

↗️ There are a lot of similarities and differences between these protocols. For more details, check out this wiki (although a bit dated).


4. Arweave

Arweave is a decentralized storage platform that uses a new data structure called a blockweave. Blockweaves allow Arweave to offer scalable, resilient, and efficient storage.

So, what are Blockweaves? 🤔

We all know that a blockchain is just a chain of linked blocks that contain transaction data. Blockweaves are similar to blockchains in that they are chains of blocks, but each block also stores data and links to multiple previous blocks in the network rather than only to the one right before it.

Image Credit: Arweave Whitepaper

Blockweaves require miners to provide a 'Proof-of-Access' to old data before they can add new blocks.

Unlike a more traditional blockchain, where miners spend computing power and electricity and compete to mine a block to earn tokens, the Arweave network encourages miners to store and replicate valuable data to earn tokens.
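Here's a rough Python sketch of the Proof-of-Access idea. It is a heavily simplified toy, not Arweave's actual protocol: the `make_block`, `recall_index`, and `mine` functions are hypothetical, and picking the old block by taking the previous block's hash modulo the chain height is an assumption made for illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def make_block(index, data, prev_hash, recall):
    """Build a block that links to the previous block and to a recall block."""
    data_hash = sha256_hex(data)
    header = f"{index}|{prev_hash}|{recall}|{data_hash}".encode()
    return {"index": index, "prev_hash": prev_hash, "recall": recall,
            "data_hash": data_hash, "hash": sha256_hex(header)}


def recall_index(prev_hash: str, height: int) -> int:
    """Derive which old block must be presented from the previous block's hash."""
    return int(prev_hash, 16) % height


def mine(weave, miner_storage, new_data):
    """Add a block only if the miner can show the recall block's data."""
    prev = weave[-1]
    recall = recall_index(prev["hash"], len(weave))
    old_data = miner_storage.get(recall)
    if old_data is None or sha256_hex(old_data) != weave[recall]["data_hash"]:
        return None  # no access to the old data, so no new block
    block = make_block(len(weave), new_data, prev["hash"], recall)
    weave.append(block)
    miner_storage[block["index"]] = new_data
    return block


genesis_data = b"genesis"
weave = [make_block(0, genesis_data, "0" * 64, None)]

archivist = {0: genesis_data}  # a miner that kept the old data
freeloader = {}                # a miner that stored nothing

rejected = mine(weave, freeloader, b"cat.png bytes")
accepted = mine(weave, archivist, b"cat.png bytes")
print(rejected, accepted["index"], accepted["recall"])
```

Since a miner can't know in advance which old block it will be asked for, the more of the weave it stores, the better its chances of being able to add the next block.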

The Permaweb

Like the traditional web, the Permaweb is a collection of interlinked documents and applications, except that they are stored permanently. The Permaweb sits on top of the Arweave data storage layer.

Since the Arweave network is built on HTTP, just like the traditional web, web browsers have access to the data stored in the network.

Permaweb Layers | Image Credit: Arweave Docs


5. Storj

Storj is a DStorage platform that provides secure, scalable, private, efficient, and S3-compatible decentralized cloud object storage.

Like other DStorage platforms, Storj connects people with unused bandwidth and storage space to those needing cheap, accessible, and private file storage.

Storj uses STORJ, an ERC-20 token, to incentivize people to rent out bandwidth and storage space.


6. Sia

Sia is another popular decentralized cloud storage platform that connects renters (who rent storage space to host their files & applications) and hosts (who lend out their storage space to renters) in a peer-to-peer network. Sia has its own blockchain, and the hosts get rewarded with Sia's native utility token, Siacoin.

Once the renter uploads their file to the Sia network, it gets split up, encrypted, and sent worldwide. The network ensures the files are always accessible by making multiple copies.

And since the files are split up into multiple pieces and encrypted, they are inaccessible to the hosts.


Conclusion

Over the past decade, centralized cloud services like Google and AWS have gone mainstream by providing fast and cheap storage infrastructure to individuals and organizations. With that adoption, threats to information integrity, privacy, and freedom from censorship have been ever-present.

The DStorage model addresses the issues of centralized storage by distributing the data across a network of nodes, each of which stores a copy of the data.

This makes it much more resistant to censorship and attacks, with better information integrity and data availability than traditional storage systems.




That's it for now, fren!


#WAGMI ✌🏻


Originally published at www.cryptoshuriken.com.