I've noticed that Wikipedia is, nowadays, extremely fast if I'm *not* logged in....

bawolff · on May 29, 2023

Logged-out users get served by a Varnish cache on a server geolocated near them.

Logged-in users still get cached pages, but the cache covers only part of the page, it's not as fast a cache, the servers are not geolocated (they are only in the USA, at the main data center), and depending on your user prefs you may be more likely to get a cache miss even on the partial cache.

tuukkah · on May 29, 2023

I wonder if you could fetch the logged-out page first and then re-render client-side what's missing for a logged-in user.

Also, Firefox container tabs might be a nice solution: normally browse in a container where I'm logged out, then to edit, change the tab to my Wikipedia container where I'm logged in.

bawolff · on May 30, 2023

Well, what's missing could be the entire page, and then you would get a giant flash of repaints. It is not clear that is any better.

The main things that are different:

* the user links (they show which user you are logged in as)

* whether you have a new message notification

* your skin preferences (this will totally change the entire HTML)

* your language preference (normally this only affects the UI, not page content, but on some pages it affects page content)

* other misc preferences can affect wiki page content (e.g. thumbnail size), although there has been a general effort to avoid adding new ones and to remove existing ones that aren't used.

* some stuff makes assumptions about logged-in users not being served from Varnish (e.g. how CSRF tokens work for JS-based actions), but most of those can be changed if push came to shove.
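To make the list above concrete, here is a rough sketch of why per-user preferences make cache misses more likely: every preference that changes the rendered HTML has to become part of the cache key, so one article turns into many cached variants. The key format and the preference values are made up for illustration; this is not how MediaWiki actually keys its caches.

```python
from itertools import product

# Illustrative preference values only (assumptions, not Wikipedia's real options).
SKINS = ["vector-2022", "vector", "monobook", "timeless"]
LANGUAGES = ["en", "de", "fi"]
THUMB_SIZES = [220, 300, 400]


def variant_key(title: str, skin: str, lang: str, thumb: int) -> str:
    """Hypothetical cache key: anything that changes the HTML must be in it."""
    return f"{title}|skin={skin}|lang={lang}|thumb={thumb}"


# Logged-out readers all share the defaults, so there is one variant per page.
logged_out_variants = {
    variant_key("Varnish", SKINS[0], LANGUAGES[0], THUMB_SIZES[0])
}

# Logged-in readers can pick any combination, so the same page fragments
# into one variant per combination of preferences.
logged_in_variants = {
    variant_key("Varnish", skin, lang, thumb)
    for skin, lang, thumb in product(SKINS, LANGUAGES, THUMB_SIZES)
}

print(len(logged_out_variants), "variant when logged out")
print(len(logged_in_variants), "possible variants when logged in")
```

With the same number of requests spread over many more keys, each individual key is requested less often, so it is less likely to be in the cache when you ask for it.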

That said, there might be a reasonable argument that much of this isn't needed and that it would be a net benefit to kill most of those features.

tuukkah · on May 30, 2023

Yeah, if you use a user setting such as a non-default skin, you should arguably instead install a browser extension or use a client app built on top of PCS (the Page Content Service API): https://en.wikipedia.org/api/rest_v1/#/Page%20content

https://www.mediawiki.org/wiki/Page_Content_Service

adrr · on May 29, 2023

That's what we did at a former company. You cache the main page on the edges of a CDN. When logged in, you get a JSON blob that is also cached on the edges, using the OAuth/JWT header as a dimension of the cache key. The JSON blob has the user's information, like username, name, etc., and it's just filled in. This makes the pages extremely performant, since everything is cached on the edges, and it protects your origin servers from load. You could also shove stuff into web storage, but private mode in some browsers won't persist the data.
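A minimal sketch of that pattern, assuming a plain dict standing in for the CDN edge and a made-up token table standing in for the origin. The `/api/me` path, the token values, and all function names are hypothetical, not taken from any real CDN's API.

```python
import hashlib
import json
from typing import Dict, Optional

edge_cache: Dict[str, str] = {}          # stand-in for the CDN edge
origin_calls = {"shell": 0, "user": 0}   # how often the origin was hit

# Hypothetical token -> user table standing in for the origin's session store.
USERS = {"token-alice": "alice", "token-bob": "bob"}


def cache_key(path: str, auth_token: Optional[str] = None) -> str:
    """Key on the path, plus a hash of the auth header when one is present."""
    if auth_token is None:
        return path
    digest = hashlib.sha256(auth_token.encode("utf-8")).hexdigest()
    return f"{path}|auth={digest}"


def fetch_shell(path: str) -> str:
    """Page shell: identical for everyone, so the key is just the path."""
    key = cache_key(path)
    if key not in edge_cache:
        origin_calls["shell"] += 1
        edge_cache[key] = f"<html><!-- shell for {path} --></html>"
    return edge_cache[key]


def fetch_user_blob(auth_token: str) -> str:
    """Per-user JSON blob: cached too, but with the token as a key dimension."""
    key = cache_key("/api/me", auth_token)
    if key not in edge_cache:
        origin_calls["user"] += 1
        edge_cache[key] = json.dumps({"username": USERS[auth_token]})
    return edge_cache[key]


fetch_shell("/wiki/Varnish")
fetch_shell("/wiki/Varnish")                        # served from the edge
alice = json.loads(fetch_user_blob("token-alice"))
bob = json.loads(fetch_user_blob("token-bob"))
fetch_user_blob("token-alice")                      # served from the edge
print(origin_calls, alice, bob)
```

The shell is rendered once no matter how many users request it, and the origin only sees one request per user for the small blob.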

underwater · on May 30, 2023

I've seen this backfire when someone breaks the configuration of the cache key. All of a sudden people start getting back usernames and personal information of random users.
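A toy illustration of that failure mode, assuming the same kind of dict-backed edge cache. All names here are hypothetical; the only point is what happens when the auth dimension is dropped from the key.

```python
from typing import Callable, Dict

KeyFn = Callable[[str, str], str]


def make_user_endpoint(key_fn: KeyFn) -> Callable[[str, str], str]:
    """Build a cached endpoint whose cache key is computed by key_fn."""
    cache: Dict[str, str] = {}

    def fetch(path: str, auth_token: str) -> str:
        key = key_fn(path, auth_token)
        if key not in cache:
            # Stand-in for the origin resolving the token to a user record.
            cache[key] = f"user-for-{auth_token}"
        return cache[key]

    return fetch


def correct_key(path: str, auth_token: str) -> str:
    return f"{path}|{auth_token}"


def broken_key(path: str, auth_token: str) -> str:
    # Misconfiguration: the auth dimension was dropped from the key.
    return path


good = make_user_endpoint(correct_key)
bad = make_user_endpoint(broken_key)

good_alice = good("/api/me", "token-alice")
good_bob = good("/api/me", "token-bob")

bad_alice = bad("/api/me", "token-alice")
bad_bob = bad("/api/me", "token-bob")   # Bob is served Alice's cached record

print(good_bob, bad_bob)
```

Whoever asks first after the config change populates the cache, and everyone after them gets that person's data until the entry expires.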

undefinedzero · on May 29, 2023

This is what client-side apps do (e.g. React). They come with their own set of challenges, which turn into downsides when not solved correctly. Probably still easier for Wikipedia to do than fixing their backend.

flangola7 · on May 29, 2023

What's a cache miss?

PKop · on May 29, 2023

The resource you are requesting is not in the cache, thus requiring a full load from whatever the source is.
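In code, a rough sketch looks like this, assuming a dict as the cache and a made-up `render_page` function as the slow source:

```python
from typing import Callable, Dict


class ReadThroughCache:
    """Tiny in-memory cache that counts hits and misses."""

    def __init__(self, load_from_origin: Callable[[str], str]) -> None:
        self._load = load_from_origin
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> str:
        if key in self._store:
            self.hits += 1
            return self._store[key]
        # Cache miss: fall back to the slow origin, then remember the result.
        self.misses += 1
        value = self._load(key)
        self._store[key] = value
        return value

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def render_page(title: str) -> str:
    # Stand-in for an expensive render on the origin server.
    return f"<html>{title}</html>"


cache = ReadThroughCache(render_page)
cache.get("Varnish")   # miss: the first request goes to the origin
cache.get("Varnish")   # hit: served straight from the cache
cache.get("Caching")   # miss: a different page, not cached yet
print(cache.hits, cache.misses, cache.hit_rate())
```

The hit rate is just hits divided by total lookups, which is the kind of number a cache dashboard reports.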

Rapzid · on May 29, 2023

The cache takes a swing at the request. If it hits it returns, if it misses it gets caught by the origin.

stavros · on May 30, 2023

It's fast, temporary storage, and it's mister to you!

jeltz · on May 29, 2023

A well established term which is trivial to google. It is when you do a cache lookup and do not find anything.

Paul-Craft · on May 29, 2023

Let's be nice to the lucky 10000 who get to learn what a cache miss is today.

https://xkcd.com/1053/

starkparker · on May 29, 2023

Wikipedia uses extensive caching when you're logged out, and caches much less when you're logged in, to facilitate editing and user account functionality. https://wikitech.wikimedia.org/wiki/Caching_overview

jonatron · on May 29, 2023

Wikipedia/Wikimedia are quite open about their infrastructure. You can even see the Varnish Frontend Hitrate on their Grafana server: https://grafana.wikimedia.org/d/wiU3SdEWk/cache-host-drilldo... (currently 87.6%, min 69.9%, max 92.2%)

tuukkah · on May 30, 2023

Yes, and you can read the documentation which has links to the source code, e.g. here: https://wikitech.wikimedia.org/wiki/Caching_overview#Retenti...

robin_reala · on May 29, 2023

This is fairly typical, and applies to HN too – you’ll see dang suggesting that people log out when there’s breaking news that gets a lot of engagement here.

remram · on May 29, 2023

The difference is that HN stays very fast. Being logged in is more costly for the backend, but doesn't degrade your experience as a user. On Wikipedia you really suffer for being logged in.

electroly · on May 29, 2023

When HN is slammed, you just get the "sorry, HN is having problems" message. This is when dang suggests logging out. On many occasions that turns not being able to see the site at all into the site working quickly.

remram · on May 29, 2023

Sure, past some threshold HN goes down, and at that point there is a (big) difference between the logged-in and logged-out experience. My point is that, the rest of the time, there is not, whereas for Wikipedia, the logged-in experience is always significantly worse than the logged-out experience. This encourages people to not log in which is a problem.

kibwen · on May 29, 2023

Surely there must be a better way to do this, not just for Wikipedia but for all websites with optional logins. It seems like the only difference in the vast majority of cases is to change a single link in the corner of the page from "login" to "the name of my account", which is a silly reason to miss the cache.

swyx · on May 29, 2023

this is literally the goal of 20 year old AJAX patterns. progressively enhanced html with javascript on top. absolutely a solvable problem. perhaps wikipedia could look into using Astro or SvelteKit.

znpy · on May 29, 2023

If you’re not logged in, you’re getting the pre-rendered, cached page.

If you’re logged in, a number of things have to be recomputed before the page can be rendered.

I run MediaWiki at home, and the difference is even more stark (I have a small home server).