We start our journey in ancient history, around 285 BC: the probable moment when people created the biggest library of ancient times, the Great Library of Alexandria. We believe it held between 40 000 and 400 000 scrolls. The library was a great creation of human civilisation, built on the idea that human knowledge should be centralized to push civilisation forward. The idea that anyone could arrive in Alexandria, read and learn from the manuscripts. That idea, however noble and pure, caused one of the biggest losses of knowledge in history when the library was accidentally burned during a civil war.
Let's fast forward to Mainz, Germany, around 1440 to meet Johannes Gutenberg, the inventor of the printing press and leader of the information revolution. We no longer had to copy information by hand, and we managed to compress 40 000 - 400 000 scrolls into 10 000 - 100 000 books. But it looks like that wasn't enough, so we created libraries. The biggest library according to Wikipedia is the British Library, containing about 170 - 200 million items, which is roughly 2 000 Libraries of Alexandria. But it wasn't the only one in this world (probably that's why it didn't burn lol).
Since books became easier and faster to print, multiple libraries appeared to distribute them. Now probably every country has its own national library, and some rich countries have several large libraries that are each claimed to be the biggest in the world. Today every town has a library, and some towns have several.
But it wasn't enough to quench the thirst for knowledge. Libraries are simple yet awkward to use: you have to go to the library, register, borrow the book and return it before a deadline, and to borrow another one you have to go back and repeat the process. Producing books became cheap enough to sell them, so book stores arrived, where you can own books. But storing 100 000 books, the equivalent of the Library of Alexandria, is still a big challenge and a big expense. A stack of 10 000 volumes, counting 100 volumes per square meter, would take 100 square meters of space (source). Small compared to the Library of Alexandria, but still big compared to the living space we have in our houses.
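The shelf-space estimate is plain division, and a minimal sketch makes it easy to try other collection sizes. The density of 100 volumes per square meter is the figure from the paragraph above; the function name is only illustrative.

```python
# Density assumed in the text: 100 volumes per square meter of floor space.
VOLUMES_PER_SQUARE_METER = 100


def floor_area_m2(volumes: int, per_m2: int = VOLUMES_PER_SQUARE_METER) -> float:
    """Return the floor area in square meters needed to stack `volumes` books."""
    return volumes / per_m2


if __name__ == "__main__":
    # 10 000 volumes is the stack from the text, 100 000 the "Alexandria" size.
    for volumes in (10_000, 100_000):
        print(f"{volumes:>7} volumes -> {floor_area_m2(volumes):.0f} square meters")
```

Under the same assumption, an Alexandria-sized collection of 100 000 volumes would need about 1 000 square meters.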
And now we fast forward to the 80s of the last century, the moment when books were replaced by floppy disks and libraries by the personal computer. We didn't have to buy a book anymore: we could just copy it to a floppy disk and move it from one computer to another, or copy it from one disk to another, keeping it in two places. Some may call it piracy, because the same book is in two places at the same time; some might call it book superposition. Still, you had to find the person who possessed the book, go to them and copy it.
And finally, in the 90s, the information revolution happened: the internet. You no longer had to go out. Just dial up to the information network, then send and copy whatever you find. No copyright act, no regulation, nobody telling you what to do, where to do it, what to read, what to copy, what to send and to whom. It wasn't long before corporations spotted that they were losing profit. In 1998 the DMCA happened, to bring power back to the empire of the dark side... to the white collars. Other countries more or less followed those rules.
Fortunately the thirst for profit was very big. Moore's Law kept going and the storage on our personal computers kept increasing. The internet was growing and growing. While some websites were established, others were abandoned and lost like the Library of Alexandria. The internet was and is like ancient times: everything is centralized, and you read news you don't own, because it is owned by the server companies. Fast forward to the 2000s and server companies became cloud companies that own the content. More centralisation and more corporate control over web content. Forums became social networks, with a few of them keeping most of the content of the world for themselves.
Instead of keeping 100 000 books privately on our computer, which would take 100 GB (if we assume the average book is 1 MB in size), we are selling our soul for access to petabytes of data at the touch of a keyboard, computer mouse, game controller or even TV remote. We live in the ancient times of the internet right now: we own nothing of the content we see, and we are happy because we sit in front of the index of the greatest library of our civilisation and can pick whatever we want. We have so much to see that we can't keep up with the information. Our brains slowly explode into depression, and more and more of us live alone just to keep up with a world that moves at light speed, 1 second per day instead of the old and boring 24 hours per day. We don't sleep, we don't exercise, we eat, sleep and connect to the network.
Going that way, in the near future countries will be paralysed without an internet connection: people won't be able to pay for goods or maybe even get out of their homes. We're creating a single point of failure and the majority of us are fine with it. Just because it's easy, and we don't have to do anything except connect to the world. We are born and die every day as we connect and disconnect. We die when we get up and walk, and we are born again when we sit down and look at the bright screen that is the only bright side of our lives while the whole world around us is on fire. We try to escape to virtual reality instead of facing the truth that the planet is dying, that we breathe and eat plastic. We slowly transform into Barbie and Ken.
We're happy owning nothing, just reading and rating everything we see. Movies and music are owned by streaming providers, books are owned by ebook readers. It's convenient to access the Great Library of Alexandria with a single click. We rate something 1/10 and we feel we changed the world. We post something, get 1000 likes and feel we're not alone. We become slaves of computer algorithms. Food in a restaurant doesn't taste good if it's not ranked 4/5. We start to pay more and more for water, and soon most of us will die of thirst.
When the internet revolution started, storage was expensive. It was so expensive that building a computer with 100 GB of storage back then would have cost millions of dollars. Bandwidth was expensive and slow. Over the years we managed to improve both, but we failed to notice that instead of sending text we're now sending video, and instead of sending emails we send messages with audio, video and images. Instead of living our lives we're streaming our lives to the cloud.
A few words about the size of the text content of the internet.
The biggest open source internet content aggregator, Common Crawl, aggregated 380 TiB of data in 2022, which is roughly 380 hard disk drives of 1 TB each. If we assume each person has at least 1 TB of disk space on their computer, most of the meaningful content written to this day fits into a meeting of 380 people at a conference, or a trip to a big cinema. The English Wikipedia data dump is 20.5 GB. Stack Overflow posts and post history are 50 GB. And I'm only writing about centralized storage. Who needs all the data from centralized storage except companies that want to own that content? Who needs all the books from the Library of Alexandria if I'm only interested in one topic?
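A quick sanity check of the drive count, as a sketch rather than a measurement. It assumes "TiB" means 2^40 bytes and that a consumer "1 TB" drive holds 10^12 bytes; the helper name is made up for this example.

```python
TIB = 2**40   # binary tebibyte, the unit used for the Common Crawl figure
TB = 10**12   # decimal terabyte, as printed on consumer drives (assumption)
GB = 10**9    # decimal gigabyte


def drives_needed(size_bytes: int, drive_bytes: int = TB) -> int:
    """Smallest number of whole drives that can hold `size_bytes`."""
    return -(-size_bytes // drive_bytes)  # integer ceiling division


if __name__ == "__main__":
    common_crawl = 380 * TIB
    print("Common Crawl on 1 TB drives:", drives_needed(common_crawl))

    # Wikipedia dump plus Stack Overflow posts, sizes taken from the text.
    small_dumps = int(20.5 * GB) + 50 * GB
    print(f"Wikipedia + Stack Overflow: {small_dumps / TB:.1%} of one drive")
```

With decimal 1 TB drives the count comes out at 418 rather than 380, so the audience is still one big cinema either way.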
How many books can one person read in a lifetime?
Lenstore gave the test to 2000 people and found that the average participant took 101 seconds to complete the passage. If a person reads for 30 minutes a day at that speed, they can get through 33 books a year (assuming book lengths average out to 90,000 words).
If we simply multiply that by 16 and read 8 hours a day, that's 528 books a year. Multiply that by 60 years and it's 31 680 books. At 1 MB per book, that's roughly 32 GB of storage you can read in a lifetime. Funny coincidence that the WizardLM 30B parameter LLM quantised to 8 bits is about 32 GB. Probably that's why ChatGPT is better at answering questions than most humans. We just can't aggregate that amount of knowledge. We have reached our limits as humans and are now creating artificial intelligence to get our knowledge structured, yet we still need to verify it manually.
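The lifetime estimate can be reproduced in a few lines. This is only a sketch built on the figures used in this article: 33 books a year at 30 minutes a day, 8 hours of reading a day, 60 years of reading, and an assumed 1 MB per book. The function names are illustrative.

```python
def lifetime_books(books_per_year_at_30_min: int = 33,
                   minutes_per_day: int = 8 * 60,
                   years: int = 60) -> int:
    """Scale the 30-minutes-a-day reading rate to a whole reading lifetime."""
    books_per_year = books_per_year_at_30_min * minutes_per_day // 30
    return books_per_year * years


def storage_gb(books: int, mb_per_book: float = 1.0) -> float:
    """Storage in decimal gigabytes, assuming `mb_per_book` megabytes per book."""
    return books * mb_per_book / 1000


if __name__ == "__main__":
    books = lifetime_books()
    print(f"{books} books, about {storage_gb(books):.2f} GB of text")
```

With these inputs the product is 33 x 16 x 60 = 31 680 books, or about 31.7 GB.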
You think the internet is big, but today its content is on the scale of one residential area. The internet today is limited to one cinema. It's like one Great Library of Alexandria. The scale of the internet is nothing compared to the content people store on their computers. I'd say only a fraction of what we create goes to the cloud, something like 1% or less of the content people create. You say: I store documents, pictures and all my content in the cloud. Do you? Then you're in the lucky 10% of the world. Just to say, WEF posted that "Fewer than 1 in 5 people in the least developed countries are connected to internet" and you can read that "Globally, only just over half of households (55%) have an internet connection."
But we still bet on the cloud. The biggest content creation company, Adobe Inc, is promoting Creative Cloud. The biggest document creation company, Microsoft, is promoting its cloud and creating documents in the cloud. They have just closed their market to 50% of people. It looks like those are the poor people who can't afford the price increases for their products. The cloud is still expensive and so are the margins. The profits from the other 50%, or I'd say 10%, are so big that they don't care anymore. They'd be able to send people to Mars and back a couple of times a year using just a fraction of those margins, and they own all the content, so the junkies must pay for their products or be left with nothing.
Someone may argue that we invented web3 and cryptocurrencies. What is web3? It's basically permission to federate content, just like book printing allowed us to create town libraries. Let's say Facebook is the Great Library of Alexandria; then web3 is the technology for establishing town libraries. That's great, but it's nowhere near real content distribution and knowledge preservation. Libraries come and go, and if you want to keep your library's content you have to run your own library or move the content from one library to another.
Just to make things clear, the biggest web3 content storage technology to my knowledge is IPFS, which in short calculates a unique identifier (hash) for content and keeps that content on a "node". It's still transport encrypted, not content encrypted, and it needs internet access to get the content. All the nodes need to be exposed to the public. Let's take cryptocurrencies, e.g. Bitcoin. It is still centralized because of the agreement protocol that keeps accepting new hashes. Others, like the social network Mastodon, don't hide that they are not distributed but just federated; the same goes for PeerTube etc. All that means you still need internet access and an internet domain to access the content stored on those portals.
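To make "calculating a unique identifier (hash) for content" concrete, here is a toy sketch of content addressing. It is not the IPFS API: real IPFS splits files into blocks and wraps the hash in a CID format, while this hypothetical `ContentStore` class just keys an in-memory dictionary by the plain SHA-256 hex digest.

```python
import hashlib


class ContentStore:
    """Toy in-memory "node": content is stored under the hash of its bytes."""

    def __init__(self) -> None:
        self._blocks: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        """Store `data` and return its identifier (SHA-256 hex digest)."""
        key = hashlib.sha256(data).hexdigest()
        self._blocks[key] = data
        return key

    def get(self, key: str) -> bytes:
        """Fetch content by identifier and verify it still matches the hash."""
        data = self._blocks[key]
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("content does not match its identifier")
        return data


if __name__ == "__main__":
    node = ContentStore()
    first = node.put(b"The Great Library of Alexandria")
    second = node.put(b"The Great Library of Alexandria")
    print(first == second)   # same content always gets the same identifier
    print(node.get(first))
```

The identifier depends only on the content, not on where it is stored, which is why any node holding the same bytes can serve them. The data itself is stored as is, not encrypted, which is the point made above.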
We have WebTorrent, you say. I say WebTorrent is just a proxy to stream torrents over HTTP and WebSocket instead of using desktop applications. It's amazing from a technology point of view, since it translates the tracker and p2p traffic to HTTP, but it's more centralized than its big brother, the BitTorrent protocol.
The biggest problem I see today is that we underestimate the computing power and storage we have on our local machines. Taken together, by my estimate, that storage and computing power is far bigger than that of any cloud. Yet there is no mature technology that allows us to use that power. Such technologies are coming, like ggml, a C++ tensor library for machine learning that allows us to have our own ChatGPT without the internet, or the fediverse, where you create private communities. But they're still young.
So are we doomed? Probably, but when you look back at the beginning of the article and then at web3, it's not that bad, because we have probably just started moving in the right direction. If we slowly move forward and push the Great Library of Alexandria out to local libraries before Julius Caesar decides to burn all the content people created because of a civil war, we're going to be fine.