HackerNews Clone

imtomt 17 hours

Huh. I can’t find the actual... archive. It mentions an AI archive less than 10 sentences in, and has a couple of links, but seems devoid of any actually archived content.

kennykartman 12 hours

I'm so happy about this. Really. I cannot overstate how important the Internet Archive is for all of us.

Animats 15 hours

Oh, good. We need more backups.

The one in Egypt doesn't get updated.

basilikum 17 hours

https://web.archive.org/web/20260509154320/https://interneta...

idovmamane 18 hours

St Gallen has been archiving knowledge for over a thousand years. Now they are archiving AI models before they get retrained out of existence. The location is not a coincidence…

DeadEye2111 22 hours

Very proud of my alma mater town to be a place for this. It’s much needed infrastructure for Europe.

huflungdung 21 hours

[dead]

anant-singhal 9 hours

The uncomfortable part is that “preserving knowledge” sounds universally good until copyright law presents itself.

ukanhaupa 17 hours

cool!

arian_ 19 hours

Finally a Swiss account I can afford to open.

7777777phil 16 hours

[dead]

zkmon 20 hours

Anything that is being built today, based on assumptions about the future that extend over multiple years, is bound to fade away, because the "future is no longer what it used to be". What's the envisaged future context and purpose where this would save the world?

feiz45607 22 hours

[flagged]

latenightcoding 16 hours

>> Gen AI ARchive

isn't this a nightmare for privacy

addedGone 16 hours

Let's hope they don't use Google captcha and KYC everyone.

1vuio0pswjnm7 19 hours

https://web.archive.org/web/20260409034921if_/https://intern...

ok123456 17 hours

Where's the search bar at the top to search the archive?

jrochkind1 7 hours

> collecting the generative AI wave that is currently upon us all.

I don't understand what this means?

bl4ckneon 4 hours

I had the same thought. Does that mean they are archiving a bunch of AI stuff? Doesn't sound right to me.

dopidopHN2 1 hour

I have a related question about the domain archive.org.

I've noticed that this domain now hosts content subject to copyright.

As an example: entire seasons of Star Trek: Voyager are hosted there for direct download.

Why? Is that not a liability?

hggh 1 hour

Internet Archive is a library. Libraries host copyrighted content. Libraries are good.

damnitbuilds 1 day

"Its efforts will initially focus on [...] and collecting the generative AI wave that is currently upon us all."

Why would they want to collect the AI wave?!

But about time the Internet Archive had a US-independent backup.

kinow 1 day

> But about time the Internet Archive had a US-independent backup.

Agreed!

> The Internet Archive Switzerland, online at https://internetarchive.ch/, is a newly-formed Swiss non-profit foundation that will operate independently within its national context.

I think the Wikipedia Editors will have to decide whether they will add it to the existing page. The Operations section is still listing only U.S. data centers: https://en.wikipedia.org/wiki/Internet_Archive#Operations

Anamon 5 minutes

I might be overlooking something, but is a mirror of the Internet Archive even mentioned as a plan anywhere here? It was my first thought after reading the headline, too, but the website only speaks of archiving LLMs and, vaguely, some other collections, but not, for instance, the Wayback Machine.

rbanffy 1 day

I wonder how long it takes to back it up.

userbinator 11 hours

There are some real gems in the sea of slop; and as archivists and historians, they shouldn't moderate.

teew 17 hours

The About Us section states:

> We are a team of change-makers who believe that every helping hand can raise a child and create a better future for them.

Which I found weird. And searching for this phrase yields many site-hits verbatim, which is even weirder. Anyone know what is up with that? Is it some kind of filler text?

Edit: I guess it's from a template, the Contact section is also mumbo-jumbo (address: 123 Fifth Avenue, NY and so on).

malicka 16 hours

That doesn’t exactly instill confidence, honestly…

springtimesun 22 hours

Ah, good, they are also mirroring the page load speed of the internet archive

trvz 22 hours

Typical for something made in St. Gallen. A sensible web developer from Zurich interested in the topic would have created this website with just a single HTML file and an optional CSS file.

4ggr0 21 hours

a dev from ZH would've added a blockchain, mobile app and hosted it on an over-allocated kubernetes cluster. 97% uptime and you need a macbook pro so the website doesn't stutter.

shermantanktop 20 hours

A south-of-the-Limmat Migros shopper would use React and Vercel, but still use raw JS Date.

dang 8 hours

Normally we'd reply with "please don't do regional flamewar on HN" but this sounds so good-humored to me that I've canceled the (no doubt well-intentioned) downvotes instead.

Edit: now someone is going to tell me how mean internecine Swiss conflict actually is...

springtimesun 3 hours

As someone who lives in Switzerland, but is not Swiss, I love this kind of thing. It’s an insight into an internal cultural understanding I didn’t get growing up and doesn’t really come up in the conversations I have day to day.

slater 7 hours

There's an entire ditch between the French-speaking and German-speaking parts of Switzerland, filled with Röschti to keep the two apart. True story!

red_admiral 22 hours

Sankt Gallen's more physical archive is worth a visit too: https://www.stiftsbezirk.ch/de/stiftsbibliothek/

woodson 17 hours

Indeed. And the one in Admont, Austria: https://stiftadmont.at/en/about-the-abbey-library/

colinmegill 17 hours

If you are running that thing, and reading this post: just do the right thing and get your own name.

moontear 15 hours

Uhm... https://blog.archive.org/2026/05/06/internet-archive-switzer...

colinmegill 11 hours

I guess I stand corrected, but I maintain it was word salad :)

Vasbarlog 22 hours

Hugged to death? I can’t access the page.

embedding-shape 22 hours

Have you tried just letting it load? Took maybe more than 30 seconds for the page to load for me, but it did load eventually.

Hendrikto 22 hours

Same for me. I cannot access it either.

KomoD 21 hours

Yep, just loading forever.

AndroTux 21 hours

They just want everyone coming from archive.org to feel right at home

pedroneto3 18 hours

I am able to, too.

sixie6e 22 hours

I am able to.

alessandroberna 22 hours

Seems likely, same for me.

miki123211 16 hours

IA needs to do what Usenet has done. Have a bunch of mission-aligned but unrelated orgs (under different ownership and distributed around the world) that peer with each other, distribute all the content obtained by any of the orgs to each other, but that have no technical channel nor capability to distribute DMCA complaints and takedown requests.

This is (AFAIK) basically how Usenet piracy works. You send your warez to one provider, and that provider instantly replicates them to all the providers they peer with, recursively, until they eventually reach the entire network. When any of those providers get a DMCA complaint, they remove the offending files (as they're required to do by law), but they don't inform other providers that they've received a DMCA notice, so those providers keep serving those files. This makes it much harder to remove data from the network than it is to add it.
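To make the asymmetry concrete, here is a toy sketch of it. It assumes a made-up four-provider peering graph and models only the two behaviours described above: an upload floods to every reachable peer, while a takedown is applied by the notified provider alone. The names `peers`, `flood` and `takedown` are hypothetical, not anything from a real Usenet server.

```python
from collections import deque

def flood(peers, origin):
    """Return the set of providers that end up holding an article posted at origin."""
    holders = {origin}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for neighbour in peers[node]:
            if neighbour not in holders:
                # Each provider forwards new articles to its own peers.
                holders.add(neighbour)
                queue.append(neighbour)
    return holders

def takedown(holders, provider):
    """Only the provider that received the notice drops the article."""
    return holders - {provider}

# Hypothetical peering graph: provider -> providers it peers with.
peers = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C"],
}

holders = flood(peers, "A")
print(sorted(holders))                 # ['A', 'B', 'C', 'D']
print(sorted(takedown(holders, "B")))  # ['A', 'C', 'D']
```

One upload reaches all four providers, but one notice removes only one copy, so a full removal needs a notice per provider.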

topranks 1 hour

Usenet is a distributed policy from the ground up.

It’s centralised in the way you describe now that it’s only used for large files / piracy, but it used to be much more diverse.

cbdevidal 11 hours

I like it in theory but the IA hosts over 175PB of data. Wonder how many other producers could replicate that data.

aryonoco 8 hours

I don’t have hard data to back this up, but I estimate that plenty of main Usenet binary providers easily exceed that.

AnthonyMouse 6 hours

Suppose you don't have ten hosts that each have 175PB of data but rather a million hosts that each have an average of 1.75TB, and therefore the equivalent of 10 full copies. And then something that periodically checks if there is any given subset of the data with too few copies and makes more.
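Back-of-envelope, plus a toy version of the "check and make more" step. This is only a sketch: it assumes decimal units (1 PB = 1000 TB), and the host names, chunk ids and `min_copies` threshold are made up for illustration.

```python
from collections import Counter

TB_PER_PB = 1000  # assumption: decimal units

def full_copies(hosts, avg_tb_per_host, archive_pb):
    """How many complete copies of the archive the pooled storage could hold."""
    return hosts * avg_tb_per_host / (archive_pb * TB_PER_PB)

def under_replicated(placement, chunks, min_copies):
    """Return the chunks held by fewer than min_copies hosts."""
    counts = Counter(c for held in placement.values() for c in held)
    return sorted(c for c in chunks if counts[c] < min_copies)

def repair(placement, chunks, min_copies):
    """Copy each under-replicated chunk to hosts lacking it until min_copies is met."""
    for chunk in under_replicated(placement, chunks, min_copies):
        have = sum(chunk in held for held in placement.values())
        if have == 0:
            continue  # no surviving copy to replicate from
        for held in placement.values():
            if have >= min_copies:
                break
            if chunk not in held:
                held.add(chunk)
                have += 1

print(full_copies(1_000_000, 1.75, 175))  # 10.0

# Hypothetical placement: host -> set of chunk ids it stores.
placement = {"h1": {"a", "b"}, "h2": {"a"}, "h3": {"a", "c"}}
chunks = ["a", "b", "c"]
print(under_replicated(placement, chunks, 2))  # ['b', 'c']
repair(placement, chunks, 2)
print(under_replicated(placement, chunks, 2))  # []
```

A real system would also have to deal with hosts disappearing, limited space per host, and verifying that a host actually holds what it claims to hold.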

torhacker 3 hours

[dead]

pocksuppet 2 hours

There are only 3-4 providers because the system is spammed with hundreds of terabytes of new data per day by actors seeking to destroy it. They can't moderate the spam because the pirated data is all encrypted so indistinguishable from random data, and because moderation would destroy their pretense of not knowing what content is being posted.

anthk 1 hour

Spam is dead since Google Groups disappeared and most people just use non-binary newsgroups for high tech/culture talks.

pocksuppet 46 minutes

The binary Usenet is the one that Internet Archive would be like. It receives hundreds of terabytes of new data every day. Most of it is just random bits designed to waste space on the providers.

y3ahd0g 16 hours

So they should use BitTorrent.

IMO personal security would only be improved if we diversified away from "the open web".

"Flood the field" with protocols and pre-shared key networks where we have to generate keys together in meat space, make it too expensive to operate the panopticon.

Everyone putting their eggs in the open web basket, gathering in that public commons means all it takes is one bomb on us all, so to speak.

LocalH 15 hours

BitTorrent allows untrusted users (read: industry plants) to connect and slurp down the direct IP addresses of swarm participants. It's an unanswered legal question whether low-level uploading (such as the percentages one would get as a "leech", connecting to the torrent and then disconnecting immediately after completion) might fall under "fair use" or "fair dealing" statutes in various jurisdictions.

US-centric here: I feel that uploading a small percentage of a file as a condition of downloading the whole thing may very well fall under fair use - most BT traffic is noncommercial, the portion of the covered work uploaded by "leeches" is very small and probably would be covered by the "30-second" rule often quoted in fair use discussions. The only really arguable point is the "effect on the work's value", but then again an average leech is not uploading enough of the work to have that much of a material effect on the work's value.

fsflover 13 hours

Torrents in I2P allow fully anonymous data exchange.

y3ahd0g 13 hours

Ok private 1:1 wireguard and syncthing or rsync all the way down then

Softlink data to the appropriate mount

The options are endless and tech nerds can 1:1 help friends and family

Locking the knowledge into corporate silos is a huge security risk. The masses should be just as competent and informed so they don't panic

Minority say over the economy and government is just fascism. These people are not deities. They're normal meat and bone

We have processes to replace politicians and workers; we need processes to replace the rich.

Free speech is a circular right and there is no freedom from consequences of speech. They can face consequences too

mafuy 10 hours

In Germany at least, uploading even a single byte of content is illegal. We don't really have Fair Use here; there are only a few, very narrow exceptions.

It is also not even required to show that that single byte was uploaded, your IP getting logged as part of the swarm suffices. The burden of proof is on you now. It was much, much worse than in the US.

While all this is technically still true today, a new law a few years ago luckily mostly blocked the path. It was badly needed, because the situation was horribly abused by law firms.

pocksuppet 2 hours

> It is also not even required to show that that single byte was uploaded, your IP getting logged as part of the swarm suffices

What if someone would release software that would connect to random swarms and not upload or download anything? Would they still be criminally liable? You could disguise the purpose by saying it's measuring swarm diversity.

IndySun 6 minutes

[delayed]

defrost 8 hours

In Australia it was determined that an ISP bears no responsibility to respond to allegations of copyright infringement by ISP users.

https://en.wikipedia.org/wiki/Roadshow_Films_Pty_Ltd_v_iiNet...

Of course telcos can choose to be involved, perhaps accept payment to look up and snitch, etc., but for the most part a number of ISPs in Au just wash their hands of devoting resources to play connect the dots for others.

numpad0 7 hours

Same in Japan. There's allegedly someone making big bucks going after BitTorrent users, straining ISP abuse teams and judicial systems. Interesting that Germany has laws against that.

cortesoft 8 hours

Your comment shares bytes with copyrighted content, does that mean you broke the law?

lxgr 3 hours

Context matters.

“Here’s byte 0x67, which is at offset 0x729B1A38 of Copyrighted_Blockbuster.4k.mkv, as requested” is different from “here’s byte 0x67, and it’s the first byte of my text response to your comment”.

simondotau 8 hours

> even a single byte of content is illegal

10010110

Watch out die Deutschen, that’s the first byte of Super Mario Bros.

abc123abc123 33 minutes

Woop, woop, it's the sound of da police!

I heard a rumour that this byte also exists in the Legend of Zelda! Now go get 'em, Mr Policeman!

input_sh 22 hours

Relevant blog post: https://blog.archive.org/2026/05/06/internet-archive-switzer...

> Internet Archive Switzerland joins a growing group of mission-aligned organizations, alongside Internet Archive, Internet Archive Canada, and Internet Archive Europe. Together, these independent libraries strengthen a shared vision: building a distributed, resilient digital library for the world.

dang 15 hours

Thanks! Since the submitted URL https://internetarchive.ch/ seems to be down, I've put your link at the top and moved the other to the toptext.

rbanffy 20 hours

Also https://news.ycombinator.com/item?id=48068333, but got little traction.

card_zero 20 hours

I was interested in the others, but https://www.internetarchive.eu is a horrible corporate-looking site with a hero image, a boast about AI, a carousel of news that won't scroll without doing its slow scroll animation, a huge "meet the team" section with mugshots and boring profiles, social media links, a newsletter signup form, and nothing to say where the actual archive is.

ConceptJunkie 11 hours

Somewhere there's a "create a random, soulless, corporate website generator", and these folks used it.

carlosjobim 20 hours

Reading what little information they have there, they aren't a public facing or public serving organization. They seem to provide their services to institutions only:

"working with dozens of European libraries and government agencies to build web collections, Internet Archive Europe prioritized collaboration with cultural heritage organizations to safeguard our collective history."

badlibrarian 16 hours

Internet Archive runs a completely separate version of their site for paying institutional clients. https://archive-it.org/

In a best case scenario, this eventually becomes the replacement for the (let's be honest) absurdly awful archive.org front and backend.

So: an expansion into the EU market. And yes, a honeypot for grant funds, because why not? Good for them.

casey2 17 hours

[flagged]

justusthane 13 hours

I was excited to see there's a Canadian one, but it's just a WordPress blog?

chorizo 13 hours

They do exist and are involved in archiving. Someone reached out to our amateur radio club and offered to archive any documents we might have. They even asked to archive the video recording of one of our monthly meetings.

ferongr 18 hours

Looks like an "organization" tailor made to be awarded EU funds for their "mission".

CPLX 18 hours

Mysteries abound.

vages 17 hours

The .eu branch that card_zero criticized seems to be based in Amsterdam, the capital of the Netherlands (an EU member). Or am I missing something?

wongarsu 15 hours

I think people are questioning the "Archive" part, not the "Europe" part of the name

consumer451 20 hours

Stop complaining about availability. Instead, create a solution.

If tpb dot org can still exist ...

At least these people tried. We need a p2p archive solution ASAP. Before our history is entirely re-written.

embedding-shape 19 hours

[flagged]

xp84 19 hours

I really want to reply exhorting you to do the same, so someone else can do the same to me, but this isn’t Reddit…

consumer451 19 hours

My comment is a call to arms.

I have neither the technical nor financial abilities to address this problem.

However, as one of the greatest technical collectives of all time, the users of this website might be capable of doing such a thing.

This is likely the greatest challenge of our time.

Intralexical 18 hours

They've been constantly trying to set up P2P solutions. Torrents, DWEB, IPFS, Filecoin, WebTorrent, YJS, whole bunch of tech acronyms. I'm not sure much of it has really caught on?

https://blog.archive.org/tag/decentralized-web/

https://github.com/internetarchive/dweb-transports

Third-party attempt:

https://wiki.archiveteam.org/index.php/INTERNETARCHIVE.BAK

Turns out it's hard! Or maybe just too niche. But you can also help them today, by seeding some of the collections that are available as torrents.

arjie 18 hours

I don’t think the problem lends itself well to decentralization. People have tried to use IPFS et al for this. There were even IA attempts https://github.com/internetarchive/dweb-gateway

No one has cracked this one yet.

tylerchilds 18 hours

It has been cracked.

The internet itself is the thing we want.

We’re just constantly in denial that the internet actually does the thing we want it to do.

The internet archive is an excellent demonstration of how to do it.

It’s primarily getting a ragtag group to pool resources and manage them and then gossip with other groups that are doing the same thing.

I’ve spent so much time around the archive that I plainly see a divide between internet people online that can’t connect the dots and internet people in real life that are confused as to why the dots aren’t connecting.

The easiest way to see the dots is to:

1. Stop trying to make money

2. Tally the things that cost money

3. Amortize the upkeep over time

E.g. where do we source resources from, where do we store resources and how do we secure them.

Like HTTP, but for physical materials, not digital.

zbentley 8 hours

That's not what is meant by "decentralization".

None of those things help with the problem of centralization. Centralization isn't limited to moneymaking enterprises, or the modern internet. A centralized server operated by donations for free can just as easily go down, be seized by law enforcement, have its domain or internet service taken offline by government action, and so on.

The internet is not the thing we want (or not sufficient alone), because the internet's resources, and the communication systems between them, are largely centralized.

tylerchilds 4 hours

Yeah, I hear you.

Yeah, them as a single instance is centralized, but if you actually go (show up at 300 Funston on a Friday at 1pm) you can hear about the research into how to replicate and become the resiliency in the network to make it decentralized.

A lot of it is ancient Unix philosophy like “this massive text file is a seekable index” and “rsync does basically most of the heavy lifting” and you’ll quickly realize decentralization is a social problem and not a technical one.
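To show what "a text file as a seekable index" can look like, here is a minimal sketch. It assumes a file of `key value` lines sorted bytewise by key and binary-searches it by byte offset, so a lookup touches only a handful of lines no matter how big the file is. The file layout and the names `lookup` and `first_line_at_or_after` are invented for illustration, not the Archive's actual format or tooling.

```python
import os
import tempfile

def first_line_at_or_after(f, pos):
    """Return the first complete line that starts at byte offset >= pos."""
    if pos == 0:
        f.seek(0)
    else:
        f.seek(pos - 1)
        f.readline()  # finish whatever line contains byte pos - 1
    return f.readline()

def lookup(path, key):
    """Binary-search a sorted 'key value' text file; return the value or None."""
    target = key.encode()
    with open(path, "rb") as f:
        lo, hi = 0, os.path.getsize(path)
        while lo < hi:
            mid = (lo + hi) // 2
            line = first_line_at_or_after(f, mid)
            if line and line.split(b" ", 1)[0] < target:
                lo = mid + 1  # the match, if any, starts further right
            else:
                hi = mid
        line = first_line_at_or_after(f, lo)
    found_key, _, value = line.rstrip(b"\n").partition(b" ")
    return value.decode() if line and found_key == target else None

# Made-up index contents, already sorted by key.
with tempfile.NamedTemporaryFile("wb", suffix=".idx", delete=False) as tmp:
    tmp.write(b"example.com/a 2024-01-01\n"
              b"example.com/b 2024-02-01\n"
              b"example.org/ 2023-05-05\n")

print(lookup(tmp.name, "example.com/b"))  # 2024-02-01
print(lookup(tmp.name, "example.net/"))   # None
os.remove(tmp.name)
```

The only "database" here is a sorted file plus `seek`, which is also why rsync is enough to replicate it.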

They’re shifting more and better data than the centralized services we’re complaining about. We need better education, not innovation, at this current juncture.

The technology exists, the will of the people is lacking in spirit.

tylerchilds 3 hours

Also tacking on that ssh is a social network.

That’s the crucial social layer that powers all of the everything else on the decentralized internet.

Take git as a social platform.

SSH is the social protocol.

GitHub centralized most of the git+ssh net, but that was a choice and we use all these other git+ssh services to not give them a monopoly.

insom 22 hours

That website is really struggling. Very tempting to go to a mirror on archive.org to view it :)

This seems very distinct from Internet Archive in the US, I wonder how separate it is.

Internet Archive Canada (I worked there in 2024) operated like it was a subsidiary, even though I think it was technically an independent organization with some shared directors. Same Slack, same archive.org email domain, etc.

IA.ch has Brewster and Caslon on the board.

I suspect that for the political threats of the current decade the different Internet Archive organisations need to start operating more independently, especially when it comes to funding?

Intralexical 18 hours

Can you share more about your time at the Canadian one? I feel like there was a big hullabaloo about it years ago, but it's not really clear what they do.

insom 17 hours

Not sure what hullabaloo -- they do provide a bunch of services to Canadian institutions (including Libraries and Archives Canada) and they perform physical services like book scanning and in the last few years I believe they are the parent organization for the physical Canadian datacentre _somewhere in BC_.

For my work, I worked in their Archiving & Data Services department, on https://archive-it.org/ -- I didn't know this before I joined, but Internet Archive offers various for-pay services to other cultural institutions, mostly around archiving their stuff or white-labelling playback of archives.

For example https://webarchiveweb.bac-lac.canada.ca/ (the Government of Canada's own Internet Archive) is actually outsourced to ADS within Internet Archive.

On one hand this is neat, as IA have expertise around this, but on the other hand (as a Canadian) I don't like that it's not actually sovereign and that it looks like it's run by our government but that it's not. Tradeoffs, I guess.

crossroadsguy 22 hours

They use Slack? I am kind of surprised. But I am sure on the plus side, that would also mean having to worry about one less uptime.

insom 22 hours

Slack, Zoom and Google Apps (but not for email) - otherwise basically everything was run internally.

The Slack has (had?) hundreds of guest accounts due to volunteers and allied organizations. It’s an interesting (and cool) institution!

catlikesshrimp 19 hours

[flagged]

jedberg 18 hours

You can't register a .ch domain with fewer than 3 characters. It's showing as available because the thing that checks availability only looks at whether it's registered, not whether it's allowed.

rayhaanj 14 hours

Technically they’re used for the Cantons so to register a two letter CH subdomain you first need to register a new Swiss Canton!

Barbing 19 hours

Abbreviated internetarchive.ch ?

catlikesshrimp 18 hours

URLs don't admit abbreviations. "url shorteners" are page redirects.

adrianmonk 18 hours

They are suggesting that a human used an abbreviation rather than making a typo.

catlikesshrimp 17 hours

"Using abbreviations" of URLs is pointing to *wrong* addresses. A phishing attempt can perfectly use this misconception. This is not even malpractice.

I am not saying the user in question is malicious. I am sorry to repeat myself, but URLs don't admit abbreviations

Barbing 14 hours

Wait was that right in the sibling comment

stackghost 17 hours

Oh, so you admit to purposely playing dense by asking if it was a typo, to obliquely make a pedantic argument about phishing?

I hate this fucking website sometimes.

insom 17 hours

Thanks for trying. I assumed that ia.ch was clearly shorthand for internetarchive.ch but maybe one can't assume anything.

Barbing 14 hours

Assumed same.

Good point we shouldn’t abbreviate URLs in case they get typosquatted? Just raised in a very indirect fashion