Each socket should have two file descriptors (cr.yp.to)
120 points by riobard on July 21, 2013 | hide | past | favorite | 48 comments


A small compendium of errors...

For starters, a | b | c doesn't create two file descriptors. It creates four, two per pipe. Then there is a problem with the central argument: for regular files, and in fact every method of creating a file descriptor except pipe(), opening in RW mode creates a single descriptor. The reason for pipe()'s special behavior is two-pronged:

1. The raison d'être of pipes is for sequential communication between producer/consumer processes. There is no other reason for their design. As such, it made sense to break the convention of a single FD.

2. At the time of their design, the 70s, memory footprint was a crucial part of any design. Thus, sharing buffers between the producer and consumer FDs was paramount, and the best way to make that happen is to create both FDs at the same time.
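
To make both points concrete, a minimal sketch: pipe() hands back the read and write ends in a single call, so a | b | c needs two calls and four descriptors (the variable names are purely illustrative):

    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        int ab[2], bc[2];  /* one read/write pair per pipe */

        if (pipe(ab) < 0 || pipe(bc) < 0)
            return 1;
        /* "a | b | c" needs two pipe() calls, so four descriptors:
         * ab[0] is a->b's read end, ab[1] its write end; likewise bc. */
        printf("a|b: r=%d w=%d   b|c: r=%d w=%d\n",
               ab[0], ab[1], bc[0], bc[1]);
        return 0;
    }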

The other problem is that the pipe analogy for TCP is wrong. Which client/server protocol over TCP ever had one-directional data transfer? Close to 100% of all TCP usage is between a single client and a single server, communicating both ways.

The design of sockets is practical for their intended purpose. Shoehorning an artificial problem onto a design will, unsurprisingly, not yield proper results.


He said it creates two pipes, and later that each pipe creates two descriptors.

He never said there was no reason that sockets behaved differently than other descriptors, only that the reasons didn't warrant the broken abstraction.

And finally, I think you missed the place where he claims the abstraction breaks down. He's not complaining that you use "open" to get a file's descriptor and "pipe" to get a pipe's two descriptors, any more than it's weird to get a socket descriptor for "socket". He's saying that once you have the descriptor, the single file descriptor for sockets forces the OS to implement a socket-specific call simply to get a FIN sent properly.
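
For concreteness, that socket-specific call is shutdown(2). A rough sketch of the half-close idiom, assuming sock is an already-connected TCP socket:

    #include <sys/socket.h>

    /* Half-close: the FIN goes out immediately, but we can still
     * read whatever the peer sends back. close(sock) would only
     * cause a FIN once every descriptor referring to this socket,
     * in every process, had been closed. */
    void done_writing(int sock) {
        shutdown(sock, SHUT_WR);
    }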

The reality is that sockets could be represented not with integer descriptors but with character arrays or structs and the Unix interface wouldn't be any less incoherent than it is now with accepting descriptors, a select call that works for sockets and not files, setsockopt, shutdown, recv, connected Unix domain sockets, ioctls, and so on and so forth.


this is something I hate about the "unix design philosophy". when you start digging deeper, you realize that "everything is a file" is true, except for network connections, and A/V devices, and input peripherals, and really the only things that actually are files are files. everything else is allllmost a file, except in one little weird way, or many big big ways...


That's not a problem of the philosophy, it's a problem of its implementation. And that is caused by the myriad people and entities that have been responsible for the chaotic evolution of the various parts that make up modern Unix systems.

Design by effectively nobody can be at least as bad as design by committee.

As others pointed out, Plan 9 is much closer to a proper realization of the everything is a file design philosophy. Unsurprising, since it was grown entirely within Bell Labs under the supervision of the original Unix team. It's unfortunate it hasn't reached a point of being ready for mass adoption.


> As others pointed out, Plan 9 is much closer to a proper realization of the everything is a file design philosophy. Unsurprising, since it was grown entirely within Bell Labs under the supervision of the original Unix team. It's unfortunate it hasn't reached a point of being ready for mass adoption.

I don't think it's unfortunate. Having used it for a month straight, I'm not convinced at all that it's the right thing; in fact, it's exactly the opposite of the direction I think we should be going (as far away from the filesystem as possible).


Because you have a technical argument, or because of the poor user experience that a research platform provides?


My technical argument is that the filesystem is too complicated an abstraction for questionable benefit. Yes, sockets are really cool as files, but that's not much use outside of the unix shell.

I suspect that Plan 9 could be a good dev environment, but is not a great deployment platform—it just doesn't offer enough better than developed *NIXes to port code over.


Plan 9 was a great OS. And it did fix a lot of these problems (although I don't know if it fixed this particular one).

But it turns out that what we have is good enough, and it never gained traction.


I don't entirely agree with that characterization. I believe Plan 9 offers fundamental technical advantages that would interest a small but significant and sustainable population of users and developers, and that it's not the inertia of other platforms that it can't overcome, but itself. Its user experience is utter crap: a combination of poor or absent design and the annoying idiosyncrasies of certain of its developers.

It's a research platform, and no real effort has been made to turn it into something more practical. If someone were to pick it up and turn it into something people wouldn't hate using, I think we'd see a community on par with NetBSD or OpenBSD.


Agree completely; the problem is that a clever-sounding slogan like "everything is a file" is so appealing to people that they want to believe it even if it's not actually true. I've run into this before: https://news.ycombinator.com/item?id=2397039

Here's another similar example of how "everything is a file" falls down that I wrote about in 2005 when complaining about terminals (http://www.advogato.org/person/habes/diary/6.html)

    Then there's the whole mess of pseudo-terminals.
    If you are like me, you might wonder at first why
    pseudo-terminals are necessary. Why not just fork
    a shell and communicate with it over stdin/stdout?
    Well, the bad news there is that UNIX
    special-cases terminals. They're not just
    full-duplex byte streams, they also have to
    support some special system calls for things like
    setting the baud rate. And you might think that a
    change of window size would be delivered to the
    client program as an escape sequence. But no, it's
    delivered as a signal (and incidentally, sent from
    the terminal emulator as an ioctl()).

    Of course, you can't send ioctls over a network,
    so sending a resize from a telnet client to a
    telnet server is done in a totally different way
    (http://www.faqs.org/rfcs/rfc1073.html)
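
For the curious, the local resize path looks roughly like this (a sketch; master_fd is assumed to be the pty master held by the terminal emulator):

    #include <sys/ioctl.h>
    #include <termios.h>

    /* Terminal-emulator side of a resize: push the new size into the
     * pty with an ioctl; the kernel turns that into a SIGWINCH for
     * the foreground process group. No byte travels over the stream. */
    void resize_pty(int master_fd, unsigned short rows, unsigned short cols) {
        struct winsize ws = { .ws_row = rows, .ws_col = cols };
        ioctl(master_fd, TIOCSWINSZ, &ws);
    }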


Let's not confuse a bad implementation with a bad philosophy, as a lot of these comments seem to be doing.

Also, "everything is a file" is still true even when some files have additional operations possible on them. It's misleading to say otherwise.


I'm not even sure that "everything is a file" is particularly appealing as a philosophical position.


so I think that stating "everything is a file" is fine as an abstract position, but we have a lot of evidence that implementing a system where everything actually is a file is just too hard. If the philosophy is good, why do its adherents produce so much that is crap?


Well, for the PTY example above, the reason is backward compatibility. Eventually we need to decide to rid ourselves of the limitations that backward compatibility (with, e.g., bash) brings and clean up the implementation.


Plan 9 has addressed some of the leaks in this abstraction; many things are significantly more "file-like".

Still, it's possible that open/read/write/seek is just not the right abstraction in all cases. http://yarchive.net/comp/linux/everything_is_file.html is interesting reading.


I think it's interesting more in the sense of understanding what Unix is today. It doesn't really read as an argument against a Plan9-like realization of everything-is-a-file.

I'd summarize those posts in two main points:

1) Linux isn't a research project.

2) Bringing a Plan 9-like realization of everything-is-a-file into modern Unix really just creates an even uglier chimera than Unix already is.


For a simpler and weirder example, consider the TCP accepting socket. I learned socket programming in the early '90s and accepting sockets were just the way things worked; I never questioned them. A few years ago I had the occasion to teach someone socket programming, and accepting sockets were a giant W-T-F for me as I tried to explain how things worked.
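
For anyone who hasn't hit it: the confusing part is that the listening socket never carries data; accept() mints a brand-new descriptor for each connection. A rough sketch, error handling omitted:

    #include <netinet/in.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    void serve(unsigned short port) {
        struct sockaddr_in addr;
        int lfd = socket(AF_INET, SOCK_STREAM, 0);

        memset(&addr, 0, sizeof addr);
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(port);
        bind(lfd, (struct sockaddr *)&addr, sizeof addr);
        listen(lfd, 16);

        for (;;) {
            /* the W-T-F: lfd itself is never read or written; every
             * accept() returns a *fresh* fd that IS the connection */
            int cfd = accept(lfd, NULL, NULL);
            /* ... read/write on cfd ... */
            close(cfd);
        }
    }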


I don't think "everything is a file" was ever supposed to mean "everything works just like a disk file", or even "everything has a name in the filesystem". It really meant "every kernel-managed resource is accessed through a file descriptor". Of course, even this formulation isn't always true (network interfaces and processes are the classic exceptions), but it's true in a lot more cases. In this sense, "file descriptors" are really just the UNIX name for what Windows calls a "handle".

This is still quite a good paradigm to follow - the semantics for waiting on, duplicating, closing and sending file descriptors to other processes are generally well-defined and well understood. For example, Linux exposes resources like timers, signals and namespaces through file descriptors.
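
A timerfd, for instance, sketched here with error handling omitted (Linux-specific):

    #include <stdint.h>
    #include <sys/timerfd.h>
    #include <unistd.h>

    /* A timer that is "just a file descriptor": it can be read,
     * poll()ed, select()ed, dup()ed or handed to another process. */
    void sleep_via_fd(void) {
        int tfd = timerfd_create(CLOCK_MONOTONIC, 0);
        struct itimerspec ts = { .it_value = { .tv_sec = 2 } };
        timerfd_settime(tfd, 0, &ts, NULL);

        uint64_t expirations;
        read(tfd, &expirations, sizeof expirations);  /* blocks ~2s */
        close(tfd);
    }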


Look into Plan 9's take on the file metaphor. I'm not suggesting they got everything right, but the concept really can be made more general than plain old Unix makes it seem.


Well, you're right, not everything is a file. We can't treat a socket exactly like a file because the network is less reliable/fast/whatever than the file system. You can't treat a pipe just like a file because it needs to have someone reading it if you want to write to it. And there are special files accepting weird ioctls because they have weird capabilities that need to be exploited, etc.

But still, all these do have things in common, mainly that they can be read from and written to, and the 'file' is an abstraction that represents just that. To me this is very much an object-oriented concept: polymorphism applied to OS resources. If several resources have part of their interfaces in common, then exposing that as a higher-level abstraction can help simplify programs, which then don't have to worry about type-specific details if they don't need to.

Anyway, even if nothing really is a file, I often find it useful to think about everything as one, as long as I don't forget that it's actually not when the abstraction stops being valid. And I'd guess it also helps design and implement the system itself.


I thought 'everything is a file' mostly referred to the fact that you can read/write to every file descriptor, no matter if the file descriptor points to a socket, a pipe, etc., or an actual file.

Before that, you had completely different syscalls depending on the device you were using.


Yup. For example ioctls. You don't do that to a file.


I just don't see the problem that djb is highlighting. To me the crucial mistake comes in this sentence: "When the generate-data program finishes, the same fd is still open in the consume-data program, so the kernel has no idea that it should send a FIN." generate-data and consume-data should NEVER share an fd; the two pipes in the "same machine model" are two separate sets of fds (pipe() returns both ends in one call). Likewise, the TCP model should use two separate (sets of) sockets.

shutdown()'s real use is for poorly implemented protocols where the server has no real-time way of initiating a control message to the client apart from "abort"ing the connection; some protocols only allow the client to initiate sending a message. Also note that one end of the connection sending a FIN doesn't preclude the other end from sending more data.


I'm going to take a stab at interpreting what this article is about.

First, the missing background:

djb likes unix.

the unix philosophy is to compose small programs together to solve problems.

djb's own programs illustrate this really well. They are all small, focused tools. This allows each program to focus on its particular task or domain.

The primary method of composition in unix is the pipe in a shell. Each pipe has two descriptors. One for read and one for write.

It is very easy to create a pipe and handle pipe IO.

The article:

At some point, djb wanted to have some programs live on the network. This expands the composition beyond a single machine. If you just try to treat a socket as a standard pipe, you encounter the problem he describes.

Any program utilizing a pipe requires two file descriptors. If someone built a trivial 'netpipe', they could just dup() or dup2() the socket file descriptor to make it look like a normal pipe. The problem is that the socket then won't close until both fds are closed, so the remote end won't detect EOF. This means the 'netpipe' program has to be very clever in order to detect EOF and do a proper close on both, so the remote can see the last bytes and then EOF.
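
A sketch of that failure mode (the 'netpipe' helper is hypothetical):

    #include <sys/socket.h>
    #include <unistd.h>

    /* Hypothetical naive 'netpipe': make one connected socket look
     * like a pipe's two descriptors by dup()ing it. */
    void fake_pipe(int sock, int *rfd, int *wfd) {
        *rfd = sock;
        *wfd = dup(sock);
    }

    void producer_finished(int wfd, int sock) {
        close(wfd);
        /* No FIN was just sent: the kernel sends FIN only when the
         * LAST descriptor for the socket goes away, and rfd still
         * refers to it. The remote end never sees EOF unless we use
         * the socket-specific escape hatch: */
        shutdown(sock, SHUT_WR);
    }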


The original doesn't appear to be dated, but archive.org[1] has the same thing going back to at least 2003.

[1] http://web.archive.org/web/20030805143958/http://cr.yp.to/tc...

Edit: HTTP Last-Modified header looks like it might be right:

    Last-Modified: Tue, 10 Jun 2003 23:44:11 GMT


I think this page may predate archive.org, and that it may have lived at a different location on his site. I remember it as part of the ucspi documentation, which (I think) just barely predates tinydns.


Hey, I got a better idea. Let's give each socket a "human-friendly" name. Translating back and forth to the underlying numerical representation should be easy enough.[1]

Heck, we could even create a centralized system for managing our new namespace. OMG, maybe we could charge people money for the names? Yes! We're rich!

And the result: Hundreds of millions of "parked" domains serving up cheap advertising. Simply brilliant!

1. See Hobbit's comments in netcat source code for a differing opinion.

I often wish that people like djb or Hobbit (=low tolerance for nonsense) had designed the systems that we are now stuck with.

Though they are only applications, netcat and ucpsi have aged well and remain a pleasure to use.


Nobody is forcing you to use DNS.


Ha. That is debatable. Define "force".

At the very least, I'd say there is strong coercion. If not in favor of using names, then certainly in favor of deprecating the use of IP and port numbers (e.g., for email). Lemme guess, now you'll say "No one is forcing you to use email."


   s/ucpsi/djb'"'"'s & applications/


s/ucpsi/ucspi/g


Hmmm... use two TCP sockets?


You've missed the point. It's not about "I can't do things with TCP". It's that BSD's networking implementation, which modern Unix networking is based on, broke the "everything is a file" philosophy of Unix. Instead of using the existing file descriptor interface, they created a new socket interface, which is logically 2 file descriptors. However, it's not actually 2 file descriptors, so you now need a bunch of device-specific code.


As others have said, it wasn't a valid point to begin with. A file open for reading and writing is "logically 2 file descriptors" too, yet it's a single fd, just like a socket.


not to argue, but you can open an fd for reads and writes on unix on a normal file. A socket is not logically 2 file descriptors.


Absolutely true. Too late to fix now though.


That seems to be the general theme with a lot of network and systems programming. :(


I don't quite understand the appeal of everything is a file philosophy. Neither everything is an object for that matter. It makes the world look like this:

http://www.youtube.com/watch?v=HPeattKV74A


Now let's talk about how half the higher layer protocols in the world that use TCP should probably be using a reliable datagram protocol...


It doesn't matter: with the way IP works, any reliable datagram protocol would contain an implementation of 90% of TCP.

The more fragmented an IP packet gets along the way, the less likely it is to reach its destination, so you have to take path MTU size into account and split your datagrams accordingly. You also want to send as many datagrams as you have available in as few IP packets as possible, and you want to do slow start for the same reasons TCP does it.

Result: your datagrams need to become a stream of bytes to be handled efficiently by any transport protocol sitting on top of IP.


Someone please correct me if I am wrong, but is this something that Plan9 was solving? Everything (including sockets) being treated as a file?


(As I understand it) Plan9's file-system abstracted over network connections, yeah. You'd bind files & folders to your view of the filesystem, and that binding could cross network boundaries.

EDIT: theoh says it better https://news.ycombinator.com/item?id=6080324


tl;dr - this is about 'leaky abstractions'

http://www.joelonsoftware.com/articles/LeakyAbstractions.htm...

Joel wrote about this in 2002.

This article probably predates that, but it was last modified circa 2003 (see the other comment on that).


netcat solves this.


netcat doesn't "solve" this unix design problem any better than the author's own programs, tcpserver and tcpclient [0], solve it.

djb's argument here is that TCP sockets are more like pipes, with separate read and write buffers, and separate read-side and write-side close operations. This makes sense, but what about UDP sockets? What about operations that apply to the socket as a whole, like bind(2), listen(2) or ioctl(2)?

[0] http://cr.yp.to/ucspi-tcp.html


Plan 9 doesn't separate the read and write fds, but it does provide a separate ctl file to replace ioctl. This may be a cleaner approach: control signals are no longer "in band" operations on the data fd (or fds). Logically it seems better to have a single fd representing a full-duplex connection, and to expect the programmer to keep track of the connection state (or possibly receive an error if they don't).
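
Roughly, a hangup through the ctl file might look like this (a sketch assuming Plan 9's /net/tcp conventions, written in C for consistency; the path is illustrative):

    #include <fcntl.h>
    #include <string.h>
    #include <unistd.h>

    /* Out-of-band control, Plan 9 style: commands are plain text
     * written to a separate ctl file, never mixed into the data fd. */
    void tcp_hangup(const char *ctl_path) {  /* e.g. "/net/tcp/4/ctl" */
        int ctl = open(ctl_path, O_WRONLY);
        write(ctl, "hangup", strlen("hangup"));
        close(ctl);
    }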


This makes me long for a djb vs. Linus flamewar. Well, no, not really, because that would be counter-productive, but man, it'd sure be epic.


Getting completely off topic now, but I would actually pay money to spectate that. We just need to find something they disagree about that they also both care enough about to rant at each other over.



