
I've noticed that Wikipedia is, nowadays, extremely fast if I'm not logged in. Just about any article loads near-instantaneously; it might even be faster than Hacker News!

But if I'm logged in it's much slower: there's perhaps a second or so of lag on every page view. Presumably this is because there's a cache or fast path for pre-rendered pages, which can't be used when logged in?



If you're logged out, you get served by a Varnish cache on a server geolocated near you.

Logged-in users still get cached pages, but the cache covers only part of the page, it's not as fast a cache, the servers are not geolocated (they are only in the main data center in the USA), and depending on your preferences you may be more likely to get a cache miss even on the partial cache.
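
To illustrate the last point, here is a minimal sketch of why per-user preferences make misses more likely. It is not Wikipedia's actual cache: `CountingCache`, `render_page`, and the choice of skin and language values are assumptions made up for the example. The idea is only that every preference that changes the HTML has to be part of the cache key, so the same article splits into many entries.

```python
from itertools import product


class CountingCache:
    """Tiny in-memory cache that records hits and misses (illustrative only)."""

    def __init__(self):
        self.store = {}
        self.hits = 0
        self.misses = 0

    def get_or_render(self, key, render):
        if key in self.store:
            self.hits += 1
        else:
            self.misses += 1
            self.store[key] = render()
        return self.store[key]


def render_page(title, skin="default", lang="en"):
    # Stand-in for an expensive backend render.
    return f"<html data-skin='{skin}' lang='{lang}'>{title}</html>"


# Logged out: the key depends on the title only, so six views share one entry.
anon = CountingCache()
for _ in range(6):
    anon.get_or_render(("Varnish",), lambda: render_page("Varnish"))

# Logged in: the key also depends on preferences (hypothetical values),
# so six users with different settings produce six separate entries.
prefs = list(product(["vector", "monobook", "timeless"], ["en", "de"]))
logged_in = CountingCache()
for skin, lang in prefs:
    logged_in.get_or_render(
        ("Varnish", skin, lang),
        lambda s=skin, l=lang: render_page("Varnish", s, l),
    )

print("logged out:", anon.hits, "hits,", anon.misses, "misses")
print("logged in: ", logged_in.hits, "hits,", logged_in.misses, "misses")
```

With the same number of page views, the logged-out cache renders once, while the preference-keyed cache renders once per distinct combination.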


I wonder if you could fetch the logged-out page first and then re-render client-side what's missing for a logged-in user.

Also, Firefox container tabs might be a nice solution: normally browse in a container where I'm logged out, then to edit, change the tab to my Wikipedia container where I'm logged in.


Well, what's missing could be the entire page, and then you would get a giant flash or repaints. It is not clear that is any better.

The main things that are different:

* the user links (showing which user you are logged in as)

* whether you have a new message notification

* your skin preferences (this will totally change the entire html)

* your language preference (normally this only affects the UI, not page content, but on some pages it affects page content)

* other misc preferences can affect wiki page content (e.g. thumbnail size), although there has been a general effort to avoid adding new ones and to remove existing ones that aren't used.

* some things assume that logged-in users are not served from Varnish (e.g. how CSRF tokens work for JS-based actions), but most of those could be changed if push came to shove.

That said, there might be a reasonable argument that much of this isn't needed and it would be a net benefit to kill most of those features.


Yeah, if you use a user setting such as a non-default skin, you should arguably instead install a browser extension or use a client app built on top of PCS (the Page Content Service API): https://en.wikipedia.org/api/rest_v1/#/Page%20content

https://www.mediawiki.org/wiki/Page_Content_Service


That's what we did at a former company. You cache the main page on the edges of a CDN. When logged in, you get a JSON blob that is also cached on the edges, using the OAuth/JWT header as a dimension of the cache key. The JSON blob has the user's information (username, name, etc.) and it's just filled in on the page. This makes the pages extremely performant, since everything is cached on the edges, and protects your origin servers from load. You could also shove stuff into web storage, but private mode in some browsers won't persist the data.
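
A rough sketch of that setup, under stated assumptions: the dict `edge_cache` stands in for the CDN edge, `origin_shell` and `origin_user_blob` are hypothetical origin handlers, and the token strings are made up. The shell page is keyed by path only; the per-user JSON blob is keyed by a hash of the auth token, so each user gets their own cached entry.

```python
import hashlib
import json
from collections import Counter

edge_cache = {}           # stand-in for a CDN edge cache
origin_calls = Counter()  # counts how often the origin is actually hit


def origin_shell(path):
    # Hypothetical origin handler: the same HTML for every visitor.
    origin_calls["shell"] += 1
    return f"<html><span id='user'></span><main>content of {path}</main></html>"


def origin_user_blob(token):
    # Hypothetical origin handler: a real system would validate the token.
    origin_calls["user"] += 1
    return json.dumps({"username": f"user-for-{token}"})


def fetch_shell(path):
    key = ("shell", path)
    if key not in edge_cache:
        edge_cache[key] = origin_shell(path)
    return edge_cache[key]


def fetch_user_blob(token):
    # The token is a dimension of the cache key. It is hashed here so the
    # raw credential is not stored as a key.
    key = ("user", hashlib.sha256(token.encode()).hexdigest())
    if key not in edge_cache:
        edge_cache[key] = origin_user_blob(token)
    return edge_cache[key]


page = fetch_shell("/wiki/Varnish")
fetch_shell("/wiki/Varnish")                         # served from the edge
alice = json.loads(fetch_user_blob("alice-token"))
bob = json.loads(fetch_user_blob("bob-token"))
fetch_user_blob("alice-token")                       # served from the edge
print(dict(origin_calls), alice, bob)
```

The client would then fill the blob's fields into the cached shell, so the origin only sees the first request for each path and each token.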


I've seen this backfire when someone breaks the configuration of the cache key. All of a sudden people start getting back usernames and personal information of random users.
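
A small sketch of that failure mode, with everything hypothetical (`make_blob_cache`, the `/me` path, and the `name:secret` token format are invented for the example). The only difference between the two caches is whether the token is part of the key.

```python
def make_blob_cache(key_fn):
    """Build a fetch function whose cache key is computed by key_fn."""
    cache = {}

    def fetch(path, token):
        key = key_fn(path, token)
        if key not in cache:
            # Stand-in for the origin: derive the username from the token.
            cache[key] = {"username": token.split(":")[0]}
        return cache[key]

    return fetch


# Correct configuration: the token is a dimension of the cache key.
correct = make_blob_cache(lambda path, token: (path, token))

# Broken configuration: the token was dropped from the key.
broken = make_blob_cache(lambda path, token: (path,))

alice_ok = correct("/me", "alice:secret1")
bob_ok = correct("/me", "bob:secret2")

alice_bad = broken("/me", "alice:secret1")
bob_bad = broken("/me", "bob:secret2")  # receives the entry cached for alice

print(bob_ok["username"], bob_bad["username"])
```

With the broken key, whoever requests first populates the entry and everyone after them is served that user's data.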


This is what client side apps do (e.g. React). They come with their own set of challenges which when not solved correctly turn into downsides. Probably still easier for Wikipedia to do than fixing their backend.


What's a cache miss?


The resource you are requesting is not in the cache, thus requiring a full load from whatever the source is.
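
For a concrete feel, here is a minimal sketch using Python's standard `functools.lru_cache`, which counts hits and misses for you. `load_article` and its short sleep are stand-ins for a slow source, not anything Wikipedia runs.

```python
import time
from functools import lru_cache


@lru_cache(maxsize=128)
def load_article(title):
    time.sleep(0.01)  # pretend this is a slow load from the source
    return f"rendered:{title}"


load_article("Varnish")  # miss: not in the cache yet, full load
load_article("Varnish")  # hit: returned straight from the cache
load_article("Squid")    # miss: a different key

info = load_article.cache_info()
print(info.hits, info.misses)
```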


The cache takes a swing at the request. If it hits it returns, if it misses it gets caught by the origin.


It's fast, temporary storage, and it's mister to you!


A well established term which is trivial to google. It is when you do a cache lookup and do not find anything.


Let's be nice to the lucky 10000 who get to learn what a cache miss is today.

https://xkcd.com/1053/


Wikipedia uses extensive caching when logged out, and caches much less when logged in, to facilitate editing and user account functionality. https://wikitech.wikimedia.org/wiki/Caching_overview


Wikipedia/Wikimedia are quite open about their infrastructure. You can even see the Varnish Frontend Hitrate on their Grafana server: https://grafana.wikimedia.org/d/wiU3SdEWk/cache-host-drilldo... (currently 87.6%, min 69.9%, max 92.2%)


Yes, and you can read the documentation which has links to the source code, e.g. here: https://wikitech.wikimedia.org/wiki/Caching_overview#Retenti...


This is fairly typical, and applies to HN too – you’ll see dang suggesting that people log out when there’s breaking news that gets a lot of engagement here.


The difference is that HN stays very fast. Being logged in is more costly for the backend, but doesn't degrade your experience as a user. On Wikipedia you really suffer for being logged in.


When HN is slammed, you just get the "sorry, HN is having problems" message. This is when dang suggests logging out. On many occasions that turns not being able to see the site at all into the site working quickly.


Sure, past some threshold HN goes down, and at that point there is a (big) difference between the logged-in and logged-out experience. My point is that, the rest of the time, there is not, whereas for Wikipedia, the logged-in experience is always significantly worse than the logged-out experience. This encourages people not to log in, which is a problem.


Surely there must be a better way to do this, not just for Wikipedia but for all websites with optional logins. It seems like the only difference in the vast majority of cases is a single link in the corner of the page that changes from "login" to "the name of my account", which is a silly reason to miss the cache.


this is literally the goal of 20 year old AJAX patterns. progressively enhanced html with javascript on top. absolutely a solvable problem. perhaps wikipedia could look into using Astro or SvelteKit.


If you’re not logged in, you’re getting the pre-rendered, cached page.

If you’re logged in, a number of things have to be recomputed before the page can be rendered.

I run MediaWiki at home, and the difference is even more stark (I have a small home server).



