Why ".." dirent?
(29 Oct 2003 at 18:45)
Does any unix nerd out there have a good argument for why the parent directory is linked in every dir as ".."?
It seems to do more harm than good. For example:
- The root directory has no parent, so ".." is the same as "." there.
- When you arrive in a directory by cd'ing through a symlink, "cd .." does not bring you back to where you were, but rather to the directory's "real" parent.
- Hard links to directories screw everything up (read the warning in ln!), and I think this is because of ".." again.
- When listing a directory (an opendir/readdir loop), one often needs to explicitly ignore or special-case ".." and ".".
- Security holes are common in, say, web scripts: by sticking ".." in a filename, the remote user can escape the nominally published directory tree.
- The wu-ftpd server (and any program that passes wildcards to the libc globber) had a denial-of-service hole triggered by passing "../*/../*/../*/../*/" (etc.) as an argument to NLST.
So, what's the deal? If I were to design a filesystem today, I would just leave out "." and "..". What problems would I run into? In such a filesystem we could easily support true DAGs (i.e., hard links to directories, as long as they weren't cyclic). The pervading environment (i.e., the shell) would simply keep track of how we got to where we are and then provide us with a "cdup" to go back. (I also think it wouldn't be too hard to provide arbitrary graphs on directories, but it's not clear that that's a good idea!)
|
You'd give up the ability to do things like:
"mv foo .."
"cd somesymboliclink ; cd .."
You can do the first by cd'ing up first, or by using absolute paths, but that's the whole point of "..": it lets you make filename references (not just 'cd' references) that go in any direction without having to go up first. If you have ten files you want to move up a directory, it's a lot easier to say "mv a b c d e f g h i j .." than to say "cd ..;mv mydirectory/a mydirectory/b mydirectory/c mydirectory/d mydirectory/e mydirectory/f mydirectory/g mydirectory/h mydirectory/i mydirectory/j ."
Hierarchies are easier to understand and remember than DAGs; objects have a single clearly defined "location".
Alt: why would you need to take *out* .. entirely to do what you want? (Do unix hard links not allow true DAGs? Heck, don't soft links allow DAGs?) It seems like all you need is a shell "cdup" to get everything you want, and taking .. out is unnecessary.
NB: I haven't used unix in 10 years, but I still use ".." in commands all the time in DOS boxes in Windows. |
I'm not suggesting that the pervasive environment not support any notion of "parent directory". Er, I am suggesting that the idea of ".." be handled by storing a *path* from the root as the process's cwd, rather than just the working directory. mv and other programs that manipulate files would need to understand that, but they already make special cases for "." and ".." anyway, so, no big deal.
In this system, you can combine your two examples with sensible (to me) results:
cd /
mkdir a
mkdir b
cd b
ln (-s) /a c
cd c
touch x
mv x ..
x is in /b/x, not /x.
Unix hard links don't support DAGs. Soft links don't, either, because the parent pointers (..) have to be unique.
|
Shells aren't the only place where you use .. though. They are useful for specifying relative paths in other environments, too. For example, I use .. sometimes for specifying the paths of files in a development project so that I can move the entire project dir (or more importantly, other people can check it out from cvs to anywhere on their system) and things will still work. Along the same lines, sometimes it's useful to use .. in #includes in source files. I'm not saying that there might not be a better solution for this, too, but I'm just saying that it's not just the shell environment that you have to think about. |
I'm not asking to get rid of ".." entirely. There should definitely be a way to open files "from the parent directory", etc. It's just weird that ".." is an *actual* directory entry in each dir, that links up to its parent.
Each process currently stores its cwd ("current working directory"). My alternate suggestion is to store the cwp ("current working path"), which would be a sequence of directories that the process took to get to wherever it is. The filesystem calls like open can take this into account, and there can be a way of writing "parent directory" (we can even write that ".." to be compatible with existing notation) in the specification of a file so that you can still do all the things you can do today. It's just that instead of following the ".." links in each directory, it would instead use the process's cwp.
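To make the cwp idea concrete, here is a minimal sketch in Python. The class name `WorkingPath` and its methods are hypothetical, invented for illustration; they are not an existing API. The sketch only tracks names and never touches a real filesystem, and it assumes that "up" from the root stays at the root.

```python
class WorkingPath:
    """Hypothetical 'current working path': the list of names a process
    followed from the root, kept by the environment instead of relying
    on a '..' entry stored in each directory."""

    def __init__(self):
        self.parts = []  # an empty list means the root directory

    def resolve(self, name):
        """Return the component list that `name` refers to (no state change)."""
        # An absolute name starts over from the root; a relative one
        # starts from the recorded path.
        parts = [] if name.startswith("/") else list(self.parts)
        for piece in name.split("/"):
            if piece in ("", "."):
                continue  # empty pieces and '.' do not move anywhere
            if piece == "..":
                if parts:  # assumption: 'up' from the root stays at the root
                    parts.pop()
            else:
                parts.append(piece)
        return parts

    def cd(self, name):
        self.parts = self.resolve(name)

    def pwd(self):
        return "/" + "/".join(self.parts)


wd = WorkingPath()
wd.cd("/b")
wd.cd("c")  # suppose /b/c is a link to /a
print(wd.pwd())                            # /b/c
print("/" + "/".join(wd.resolve("../x")))  # /b/x
```

Because ".." is answered from the recorded path, "cd c" followed by "cd .." always returns to where you started, no matter what kind of link c is.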
.. is already a special case in all sorts of applications. For instance, suppose I do this:
cd /
mkdir a
mkdir a/b
cd /
cd a
cd b
cd ..
cd b
cd ..
cd b
cd ..
pwd
I get:
/a
NOT:
/a/b/../b/../b/../
even though that's logically what I did: when I cd to a relative path, it means "go into that directory." However, the shell understands that when I "cd ..", the ".." shouldn't be appended to my working directory (even though /a/b/../b/../b/../ = /a in the above example); it means instead "go up." If it's a special case, and needs to be treated specially in all sorts of applications, why not fully admit that it's a special case and stop making it look like a real directory entry in each dir?
|
Ok, I guess it's fair to suggest that '.' and '..' shouldn't appear as enumerated files when you query for a list of files in a directory. But right now things like 'mv' and other applications do NOT have special handling for '.' and '..'; that stuff is either handled by the runtime library (e.g. 'fopen'), or, more likely, by the underlying OS, so it's confusing when you say they're already special-cased.
On the other hand, if you were to try to manually expand "*/*/*", you DO have to special case '.' and '..' so you don't use them, since "*/*/*/" is definitely not supposed to refer to "../../..". In unix you don't worry about this since the shell does it, but I do have to write code for this in programs when I'm recursively traversing directories. So not enumerating would be nice, yes. But I need some kind of OS interface that can say something like "../../foo", regardless of how this functionality is implemented under the hood. Maybe you'd be happier with "^^foo" or something like that which didn't look like a normal filename.
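The special-casing described above can be sketched in Python. Python's own `os.listdir` already leaves out "." and "..", so the helper `raw_entries` (a hypothetical name, like `expand`) adds them back purely to imitate a readdir-style listing. This is an illustration under those assumptions, not how any particular shell expands wildcards; unlike a shell, it also does not skip hidden dotfiles.

```python
import os
import tempfile


def raw_entries(path):
    # Imitates a readdir-style listing that reports "." and "..".
    # os.listdir omits them, so they are added back for illustration.
    return [".", ".."] + sorted(os.listdir(path))


def expand(root, depth):
    """Manually expand a pattern like */*/* (depth = 3) under root."""
    if depth == 0:
        return [root]
    results = []
    for name in raw_entries(root):
        if name in (".", ".."):
            continue  # the special case: never descend into self or parent
        child = os.path.join(root, name)
        # The last "*" matches anything; earlier ones must be directories.
        if depth == 1 or os.path.isdir(child):
            results.extend(expand(child, depth - 1))
    return results


# Small demo tree: root/a/b/c and root/a/d (c and d are plain files).
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "a", "b"))
open(os.path.join(root, "a", "b", "c"), "w").close()
open(os.path.join(root, "a", "d"), "w").close()

print([os.path.relpath(p, root) for p in expand(root, 3)])  # ['a/b/c'] on a POSIX system
```

Without the `continue`, the walk would follow "." and ".." at every level and drift out of the tree, which is exactly the "../../.." result the pattern is not supposed to produce.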
Doesn't your example with the soft link work perfectly fine under unix, except that the mv goes to /x? This is why I was confused about what you meant by DAGs, since "proper" DAGs don't have parent pointers at all, since they're *acyclic*, hence Unix can do them fine. But yes, in the sense of being able to backwards-traverse the DAG, you're stuck with unix. |
I think they are special-cased, actually, in almost any application. For instance:
mkdir /a
mkdir /a/b
mkdir /a/c
cd /a
mv b c (works normally)
mv .. c (doesn't work)
mv . .. (doesn't work)
In this sense "." and ".." *don't* act like normal directory entries.
Otherwise, yeah, that is what I am suggesting. Give the process a way to know its path, and then it (or libc) knows how to go 'up' when a filename specifies 'up'. I don't care how it's spelled, only that the underlying filesystem has consistent semantics.
Yes, in my example earlier it goes to /x under standard unix semantics. That's a little strange because we arrived at that directory by cd b ; cd c. What I was saying was that ".." doesn't correspond to "up" if you try to make DAGs in unix, i.e., "cd dir" and "cd .." are not inverses. |
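The mismatch can be reproduced with a short Python sketch on a system that supports symlinks. A temporary directory stands in for "/", and the variable names are only for illustration. `os.path.realpath` follows the link and then the real parent, while `os.path.normpath` cancels "c/.." textually, the way a path-tracking environment would.

```python
import os
import tempfile

root = os.path.realpath(tempfile.mkdtemp())  # stands in for "/"
os.mkdir(os.path.join(root, "a"))
os.mkdir(os.path.join(root, "b"))
# Equivalent of: cd /b ; ln -s /a c
os.symlink(os.path.join(root, "a"), os.path.join(root, "b", "c"))

path = os.path.join(root, "b", "c", "..")

# Physical: follow the link to /a first, then take /a's parent.
physical = os.path.realpath(path)
# Logical: drop "c/.." textually, without consulting the filesystem.
logical = os.path.normpath(path)

print(physical == root)                    # True: ".." landed in "/"
print(logical == os.path.join(root, "b"))  # True: "up" from /b/c is /b
```

The two answers differ, which is the sense in which "cd dir" and "cd .." fail to be inverses once links are involved.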