should use /stretch instead of /testing #54

Open
weaselp opened this issue Jan 27, 2017 · 11 comments


@weaselp

weaselp commented Jan 27, 2017

Right now, there is a whole hierarchy of manpages in /testing. It should probably use the codename instead of the alias, i.e. /stretch, so that we don't need to resync and rebuild everything when stretch becomes stable and we get a new testing.

@stapelberg
Contributor

The rationale behind the use of suites instead of codenames for testing and unstable is to convey the stability of the URL to the user: it’s quite possible that packages (and hence their manpages) get removed from testing/unstable, but it’s unlikely the same happens with a released version.

With regard to resyncing/rebuilding: wouldn’t we need to do this in either case? When stretch gets released, we’d need to sync/build the buster directory.

@weaselp
Author

weaselp commented Jan 27, 2017

Right now, when stretch gets released, you need to build two trees (a new testing and the new stretch). In the other case, you'd have to build only a new buster.

(Also, I think it should be /sid instead of /unstable, but as sid always remains unstable that's less of an issue)

(And people tend to run codenames, not aliases. I.e. people install "stretch", not "testing" right now.)

@stapelberg
Contributor

> Right now, when stretch gets released, you need to build two trees (a new testing and the new stretch). In the other case, you'd have to build only a new buster.

Makes sense.

To put the effect of the proposed change into perspective: we’re talking about a single debiman run once every two years taking a small number of hours instead of a small number of minutes.

Unless there are other good reasons for this change, I’d prefer it if we prioritized stable URLs over saving a few CPU hours once every two years.

@jcristau

The codenames are the stable part, stable/testing/unstable should just 302 to the current codenames.
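
To make the proposal concrete, here is a minimal sketch of such an alias redirect. It is not debiman's code: the `SUITE_ALIASES` table and the `resolve` function are hypothetical names, and the mapping reflects the Debian suites at the time of this thread (jessie stable, stretch testing, sid unstable).

```python
# Hypothetical alias table for early 2017; not taken from debiman.
SUITE_ALIASES = {
    "stable": "jessie",
    "testing": "stretch",
    "unstable": "sid",
}


def resolve(path):
    """Return (status, location) for a request path like '/testing/pkg/page.1.en.gz'.

    Alias suites answer with a 302 to the codename; anything else is
    served as-is (status 200 with the unchanged path).
    """
    parts = path.lstrip("/").split("/", 1)
    suite = parts[0]
    rest = parts[1] if len(parts) > 1 else ""
    codename = SUITE_ALIASES.get(suite)
    if codename is None:
        return 200, path
    return 302, "/" + codename + ("/" + rest if rest else "")
```

With this layout only the alias table changes at release time; the codename directories themselves stay put.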

@stapelberg
Contributor

I agree for stable (and don’t care about unstable, as its codename never changes), but I think testing should be treated differently because content can vanish from testing, effectively breaking links.

Putting the rationale into perspective: as long as we can keep our custom 404 handler, packages being removed probably isn’t a big deal, since users will then get a direct link to the same manpage in a different suite — provided the manpage in question does exist in any other suite.
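
The 404 fallback described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual handler: the in-memory `index` (suite name mapped to a set of page paths) and the `lookup` function are hypothetical.

```python
def lookup(index, suite, page):
    """Look up `page` in `suite`; on a miss, suggest the same page elsewhere.

    `index` is a hypothetical dict mapping suite name -> set of page paths.
    Returns (200, [url]) on a hit, or (404, alternatives) on a miss, where
    alternatives lists the same page in every other suite that has it
    (possibly empty).
    """
    if page in index.get(suite, set()):
        return 200, ["/" + suite + "/" + page]
    alternatives = sorted(
        "/" + other + "/" + page
        for other, pages in index.items()
        if other != suite and page in pages
    )
    return 404, alternatives
```

The empty-list case corresponds to the caveat above: the fallback only helps if the manpage exists in some other suite.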

Let me recap the pros/cons I heard in this discussion about the current file location:

  • con: we spend more computational resources once every 2 years
  • pro: links convey their relative stability (codename = stable, suite name = subject to change)

Am I missing anything?

@weaselp
Author

weaselp commented Jan 27, 2017

Forget the resources point for a minute. My other point is that I don't run testing. I run stretch. So stretch is the "stable" name to use for my manpages.

(Also, packages also get removed from stable. It's rare, but not unheard of. I therefore think "packages might go away" is not a good reason for differentiating things).

@stapelberg
Contributor

> Forget the resources point for a minute. My other point is that I don't run testing. I run stretch.

You run stretch. I run testing. I don’t know if either group of users is significantly larger than the other. Is there a way to find out the respective numbers of users?

> So stretch is the "stable" name to use for my manpages.

Note that you can use stretch to address manpages; the appropriate redirects are in place. We’re merely talking about what shows up in the URL bar when the page has loaded.

> (Also, packages also get removed from stable. It's rare, but not unheard of. I therefore think "packages might go away" is not a good reason for differentiating things).

True. But, given that packages getting removed from stable is a rare event, whereas package removal from testing is an automated process, I maintain there’s a meaningful difference which makes sense to consider.

@anarcat mentioned this issue Jan 31, 2017
@anarcat
Contributor

anarcat commented Apr 21, 2017

> You run stretch. I run testing. I don’t know if either group of users is significantly larger than the other. Is there a way to find out the respective numbers of users?

I don't know. But I run stretch. :)

This issue makes fetching raw manpages actually rather slow here. Working on dman in #57, I found it can take more than 2 seconds to fetch a manpage because I go through the redirector instead of hitting the static site directly:

$ time curl -s -L -I http://manpages.debian.org/stretch/manpages-fr/man.fr.gz 
HTTP/1.1 301 Moved Permanently
Date: Fri, 21 Apr 2017 19:35:25 GMT
Server: Apache
X-Content-Type-Options: nosniff
X-Frame-Options: sameorigin
Referrer-Policy: no-referrer
X-Xss-Protection: 1
Location: https://manpages.debian.org/stretch/manpages-fr/man.fr.gz
Content-Type: text/html; charset=iso-8859-1

HTTP/1.1 302 Found
Date: Fri, 21 Apr 2017 19:35:26 GMT
Server: Apache
X-Content-Type-Options: nosniff
X-Frame-Options: sameorigin
Referrer-Policy: no-referrer
X-Xss-Protection: 1
Strict-Transport-Security: max-age=15552000
Public-Key-Pins: pin-sha256="m2r9mfIa+ot6bIIC0bCt/7KZ1ych/f8QY3gk9cqUWqs="; pin-sha256="35f/cSfa9he3sUJgp1wZT9gzbI7/zH10hlT/utpEziU="; max-age=5184000
Location: https://dyn.manpages.debian.org/stretch/manpages-fr/man.fr.gz?
Cache-Control: max-age=3600
Expires: Fri, 21 Apr 2017 20:35:26 GMT
Content-Type: text/html; charset=iso-8859-1

HTTP/1.1 307 Temporary Redirect
Date: Fri, 21 Apr 2017 19:35:26 GMT
Server: Apache
X-Content-Type-Options: nosniff
X-Frame-Options: sameorigin
Referrer-Policy: no-referrer
X-Xss-Protection: 1
Strict-Transport-Security: max-age=15552000
Public-Key-Pins: pin-sha256="3T9ypiCPJdEeUOpKSooGJ1IpFbKsl/ktH0dV/wygJMk="; pin-sha256="xV7KmbTUH6WeUjOC5Tv7gsOpie45AvOH8/vjaIBsBxk="; max-age=5184000
Content-Length: 68
Content-Type: text/html; charset=utf-8
X-Clacks-Overhead: GNU Terry Pratchett
Location: https://manpages.debian.org/testing/manpages-fr/man.7.fr.gz
Content-Language: fr

HTTP/1.1 200 OK
Date: Fri, 21 Apr 2017 19:35:27 GMT
Server: Apache
X-Content-Type-Options: nosniff
X-Frame-Options: sameorigin
Referrer-Policy: no-referrer
X-Xss-Protection: 1
Strict-Transport-Security: max-age=15552000
Public-Key-Pins: pin-sha256="m2r9mfIa+ot6bIIC0bCt/7KZ1ych/f8QY3gk9cqUWqs="; pin-sha256="35f/cSfa9he3sUJgp1wZT9gzbI7/zH10hlT/utpEziU="; max-age=5184000
Last-Modified: Sat, 17 May 2014 15:14:03 GMT
ETag: "1c50-4f999fafc64c0-gunzip"
Accept-Ranges: bytes
Cache-Control: max-age=3600
Expires: Fri, 21 Apr 2017 20:35:27 GMT
Vary: Accept-Encoding
X-Clacks-Overhead: GNU Terry Pratchett
Surrogate-Key: busoni
Content-Language: fr

0.08user 0.00system 0:01.45elapsed 6%CPU (0avgtext+0avg

There is no easy way to reliably determine the "testing" string from within dman: right now I use lsb_release -c -s to extract "stretch", and I am not sure I can find a way to extract "testing" from there, because testing is really like unstable in this respect. At most I would be able to extract "testing/unstable", which won't work.
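
As an illustration of why the codename is the natural key on the client side, here is a rough sketch of building the request URL from it. This is not dman's actual code (dman is a separate script); `current_codename` and `manpage_url` are hypothetical helpers, and the URL layout is simply copied from the curl call above.

```python
import subprocess

BASE_URL = "https://manpages.debian.org"  # host from the curl call above


def current_codename():
    """Ask lsb_release for the codename (e.g. 'stretch'), as mentioned above."""
    result = subprocess.run(
        ["lsb_release", "-c", "-s"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def manpage_url(codename, path):
    """Join a codename and the rest of the path into a request URL.

    `path` is whatever follows the suite in the URL, for example
    'manpages-fr/man.fr.gz' in the transcript above.
    """
    return BASE_URL + "/" + codename + "/" + path.lstrip("/")
```

On a stretch system, `manpage_url(current_codename(), "manpages-fr/man.fr.gz")` would produce a `/stretch/...` URL, which currently still goes through the redirector.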

Fetching the manpage directly with "testing" lowers the load time from 1450ms (above) to about 450ms (below), so roughly a threefold speedup, because I do not hit the dynamic redirector:

$ time curl -s -L -I https://manpages.debian.org/testing/manpages-fr/man.7.fr.gz
HTTP/1.1 200 OK
Date: Fri, 21 Apr 2017 19:37:38 GMT
Server: Apache
X-Content-Type-Options: nosniff
X-Frame-Options: sameorigin
Referrer-Policy: no-referrer
X-Xss-Protection: 1
Strict-Transport-Security: max-age=15552000
Public-Key-Pins: pin-sha256="m2r9mfIa+ot6bIIC0bCt/7KZ1ych/f8QY3gk9cqUWqs="; pin-sha256="35f/cSfa9he3sUJgp1wZT9gzbI7/zH10hlT/utpEziU="; max-age=5184000
Last-Modified: Sat, 17 May 2014 15:14:03 GMT
ETag: "1c50-4f999fafc64c0-gunzip"
Accept-Ranges: bytes
Cache-Control: max-age=3600
Expires: Fri, 21 Apr 2017 20:37:38 GMT
Vary: Accept-Encoding
X-Clacks-Overhead: GNU Terry Pratchett
Surrogate-Key: busoni
Content-Language: fr

0.06user 0.00system 0:00.45elapsed 15%CPU (0avgtext+0avgdata 11268maxresident)k
0inputs+0outputs (0major+822minor)pagefaults 0swaps
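
For comparing such transcripts, a small helper can pull the hops out of a `curl -s -L -I` header dump. This is only a sketch; `redirect_chain` is a hypothetical name, and it assumes each response starts with an `HTTP/` status line as in the output above. Applied to the first transcript it would report three redirects (301, 302, 307) before the final 200; applied to the second, none.

```python
def redirect_chain(header_dump):
    """Return a list of (status, location) tuples, one per response.

    `location` is None for responses without a Location header.
    """
    hops = []
    for line in header_dump.splitlines():
        line = line.strip()
        if line.startswith("HTTP/"):
            # Status line, e.g. "HTTP/1.1 302 Found"
            hops.append([int(line.split()[1]), None])
        elif line.lower().startswith("location:") and hops:
            hops[-1][1] = line.split(":", 1)[1].strip()
    return [tuple(hop) for hop in hops]


def count_redirects(header_dump):
    """Count the 3xx responses in a header dump."""
    return sum(1 for status, _ in redirect_chain(header_dump) if 300 <= status < 400)
```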

The original dman version iterated over all manpage sections, which multiplied this delay by roughly an order of magnitude: loading the "man" manpage could take around 20-30 seconds (because it's in section 7).

Even if only because of this performance issue, I would argue towards using codenames here.

I'll also note that this is similar to the #69 issue, in that we need to determine policy for the redirector...

Thanks!

@stapelberg
Contributor

  1. Use https instead of http in the URL to get rid of one redirect
  2. You could hit dyn.manpages.debian.org directly to get rid of another redirect

In my tests, this reduces the time to 0.4s (hitting dyn.manpages.d.o on manziarly) or 1s (hitting dyn.manpages.d.o on cgi-grnet-01). Not sure why the latter is twice as slow; but we can look into that.
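
The two suggestions above amount to a small URL rewrite on the client side. A minimal sketch, assuming the hostnames quoted in this thread; `shortcut_url` is a hypothetical helper, and swapping in the dyn host is left opt-in because it hardcodes a deployment detail:

```python
from urllib.parse import urlsplit, urlunsplit

MAIN_HOST = "manpages.debian.org"
DYN_HOST = "dyn.manpages.debian.org"  # hostname quoted in this thread


def shortcut_url(url, use_dyn_host=False):
    """Rewrite a manpage URL to skip avoidable redirects.

    Always upgrades the scheme to https (suggestion 1); optionally swaps
    the main host for the dyn host (suggestion 2).
    """
    parts = urlsplit(url)
    netloc = parts.netloc
    if use_dyn_host and netloc == MAIN_HOST:
        netloc = DYN_HOST
    return urlunsplit(("https", netloc, parts.path, parts.query, parts.fragment))
```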

@anarcat
Contributor

anarcat commented Apr 22, 2017

I'm a bit hesitant about hardcoding a debiman-specific hostname in the script. That thing can live in the wild for a looong time and I'm not sure I want to commit to this internal detail... I did commit to https in dman though.

@stapelberg
Contributor

The dyn.manpages.d.o hostname is specified in the opensearch.xml file, so we’d need to set up a redirect if we transition away from it anyway.
