Warm tip: This article is reproduced from serverfault.com, please click

Why don't I see pipe operators in most high-level languages?

发布于 2009-10-19 03:39:31

In Unix shell programming the pipe operator is an extremely powerful tool. With a small set of core utilities, a systems language (like C) and a scripting language (like Python) you can construct extremely compact and powerful shell scripts, that are automatically parallelized by the operating system.

Obviously this is a very powerful programming paradigm, but I haven't seen pipes as first class abstractions in any language other than a shell script. The code needed to replicate the functionality of scripts using pipes seems to always be quite complex.

So my question is why don't I see something similar to Unix pipes in modern high-level languages like C#, Java, etc.? Are there languages (other than shell scripts) which do support first class pipes? Isn't it a convenient and safe way to express concurrent algorithms?

Just in case someone brings it up, I looked at the F# pipe-forward operator (forward pipe operator), and it looks more like a function application operator. It applies a function to data, rather than connecting two streams together, as far as I can tell, but I am open to corrections.

Postscript: While doing some research on implementing coroutines, I realize that there are certain parallels. In a blog post Martin Wolf describes a similar problem to mine but in terms of coroutines instead of pipes.

Questioner
cdiggins
Viewed
0
Mark Aufflick 2009-10-19 11:59:45

You can do pipelining type parallelism quite easily in Erlang. Below is a shameless copy/paste from my blogpost of Jan 2008.

Also, Glasgow Parallel Haskell allows for parallel function composition, which amounts to the same thing, giving you implicit parallelisation.

You already think in terms of pipelines - how about "gzcat foo.tar.gz | tar xf -"? You may not have known it, but the shell is running the unzip and untar in parallel - the stdin read in tar just blocks until data is sent to stdout by gzcat.

Well a lot of tasks can be expressed in terms of pipelines, and if you can do that then getting some level of parallelisation is simple with David King's helper code (even across erlang nodes, ie. machines):

pipeline:run([pipeline:generator(BigList),
          {filter,fun some_filter/1},
          {map,fun_some_map/1},
          {generic,fun some_complex_function/2},
          fun some_more_complicated_function/1,
          fun pipeline:collect/1]).

So basically what he's doing here is making a list of the steps - each step being implemented in a fun that accepts as input whatever the previous step outputs (the funs can even be defined inline of course). Go check out David's blog entry for the code and more detailed explanation.