|Question and Answer Session
(18 Apr 2006 at 01:01)
"What is the haps, Tom?"
A."Easter and Taxes! But taxes sure are boring. Let's not talk about that. On Easter weekend we helped to prepare a lovely and alarmingly gourmet dinner replete (this usage problem is thus placed in the blog post like a hidden Easter egg for you, grammarnerd) with mojitos and microbrew, and to demonstrate our ability to be simultaneously classy and childishly base we also roasted marshmallow peeps over a Sterno on fondue forks to make S'meeps and microwaved them so that they'd explode Mr. Wizard style, encouraged or at least prophesized by the fortune-cookie message Ms. Nerd's mom sent with the Peeps, namely
and we also dyed Easter Eggs just like old times. You have already seen some of those eggs above. If you look carefully, a few of the eggs appear toxically slick, because they were coated with a strange 10¢ egg-dyeing kit which still stains my hands more than a day later, despite scrubbing with the most potent solvent known to mankind (nail polish remover), so I am just going to wait until my fingers exfoliate and take that sticky shit with it, like how sharks don't use dentists or teeth-sharpeners, they just keep squeezing out new rows of teeth when their old ones get dull.
HELP STAMP OUT PEEPS ABUSE
Speaking of cavities, Neal and I have just invented a new candy. It's called milk coffulate and it's just plain old milk chocolate except that where every granule of cocoa dust was once present, it is instead replaced by finely ground coffee bean. What do you think? Does this confection already exist? Often when I have a good idea it is in fact the case that it has already been invented.
Continuing in reverse chronology: On saturday it was national Really Nice Day in Pittsburgh Day, and Cortney took me on my first ever urban bike riding trip. We went downtown on the Furnace Trail and then to the South Side. I find running or bike riding around town to be really empowering much unlike driving a car, which I find really panic-inducing and crummy. I still need to build up some ass callouses or whatever the analogue of callouses are for muscles because my butt is really sore from that hard seat, but I'm looking forward to more biking as the weather gets even nicer. That night we went and saw Winterpills and Rosie Thomas at the Club Café, which was a good show as usual. When I visited the merch table to pick up a Winterpills CD the guy invited me to visit the band's myspace profile, which means that either I am too easily pegged as an internet kind of guy or myspace is even more commonplace than I have yet to become comfortable with. But now my band's myspace profile is friends with his band's myspace profile, whatever that means.
Speaking of internet entertainment systems, my internet music friend MAT64 has a new album TurboLoad out which I am rocking to right now. The last two songs are especially good. This is Commodore 64 chiptune stuff, so only listen if you like to
Wow, is this long enough yet or what? Speaking of long, in a bit of spare time I made an entry for the 2006 Underhanded C Contest. (I've always wanted to enter the International Obfuscated C Code Contest, but the competition there is really stiff; I figure it'd be better to get in on the ground floor with a relatively new contest.) Here the idea is to write a program that appears to be a very normal and easy to understand program but that conceals some devious behavior. In this case, you have to write a simple "benchmark" application that runs with vastly different performance on two different operating systems. (Mine runs about 130 times slower on Linux than it does on Windows.) This kind of contest is almost as fun as the IOCCC, but makes a more important point: Whereas everybody sort of already knows that one can write very obfuscated code, this contest makes it clear how easy it is to write programs that look trustworthy (even under substantial scrutiny) but that contain malice like backdoors and such. This fact is really damning against software projects like OpenBSD that attempt to build secure and robust code merely by pouring a lot of time into looking at and testing it. This kind of contest really shows that this is a losing battle if there are malicious parties involved, and highlights the usefulness of something rather more infallible like Proof Carrying Code for ensuring the saftey (or some day maybe even correctness) of code from unknown origin. Since the contest doesn't end for another few months though, I'm going to sit on my entry in case any brilliant enhancements occur to me.
Speaking of programming contests—one last thing—we have been pouring a lot of time into the ICFP contest and I think it is going to be just so damn fun this year. Pass on the word to your programmer buddies and please do take a look come July."
Did you like, do anything for Passover? Like, it was a major holiday season: Passover, Easter, Taxes.
|Yeah my inventions keep getting invented too. See: two-door mailbox, electronic sheet music, etc.|
|Is there something more to what "electronic sheet music" that I should know about? Alls I'm picturing is, like, Finale or something.|
|> Often when I have a good idea it is in fact the case that it has already been invented.
Yeah - something like that happened to me just a couple hours ago. One day several years ago at Aladins you asked if there was some computer vision technique that could be adapted to make a good cubase (or whatever) plugin and I was like "apply Bill Freeman's VISTA to make one thing sound like some other thing", and today i saw a Wired article for something called "sCrAmBlEd?HaCkZ!" which is sort of the same idea. Except VISTA is still much cooler than S.C. because it forces the adjacent sound clips to blend naturally together using a markov random field, in addition to just sounding like the original. Hopefully it would sound a lot more like the source database.
Tom can attest to the fact that this is almost exactly the same design as my invention!!
|Was the Scrambled-Hackz thing by Brian Whitman? Just a guess.|
|max: ok, so it is a simple idea, but a very good one.|
|> Was the Scrambled-Hackz thing by Brian Whitman? Just a guess.
No. That guy looks pretty cool, though.
|Your slander against OpenBSD is quite ridiculous. I might skim these code entries and think they were okay, if rather poorly written, but they would hardly hold up to "substantial scrutiny." Furthermore, auditing is not at all the only thing OpenBSD does. Have you not heard of W^X, ProPolice, or the recent randomized mmap()/malloc() work? Why are you commenting on something you know nothing about?
My advice is, go back and stick to your proof toys. Whatever floats your boat. But spitting on people who provide real security -- which you have demonstrated your ignorance of -- will only make you look like a joke.
|dc-dc: Sure, I know about all of that stuff. What makes you think I don't? But they are just last resorts for handling broken C code. By that I mean, in the case that a C program goes wrong, protections like StackGuard or PAX or whatever just make it more difficult (but seldom impossible) to exploit the bug. Fine, that does add some security in practice, but it's not nearly as secure as writing correct code in the first place. Moreover, they do nothing to protect against maliciously concealed backdoors except to modestly limit the variety of techniques that can be concealed. I claim the OpenBSD methodology of code scrutiny simply doesn't work--this is easily seen by the number of security patches that continue to be released for BSD (and Linux, etc.) and BSD software like openssh. I know, I subscribe to bugtraq, and I have been rooted on an up-to-date system running minimal services.
I don't know if you're actually interested in having a dialogue about this, or if you're just spitting venom, but my criticism is in fact based on knowledge and study of the problem. I'm studying for my PhD in Computer Science on a related problem, even. If you'd like to respond to my actual points, perhaps by explaining why you think Proof Carrying Code is an inappropriate solution, or how code scrutiny can scale to larger and larger programs despite its apparent failings on current code, I am listening. I promise a level-headed reply. Honestly, my guess is that you don't actually know what PCC is about, but I wouldn't criticize you for that as it is more a failing of the academic community for not making these ideas accessible to the mainstream. Still, if my guess is right, you do yourself a disservice by being so dismissive of it. Mathematics is nothing but clear thought, and clear thought is precisely what is required for the analysis of complex programs.
|They're coming out of the woodwork! I suspect Google is involved here...
Seriously though, I think we can agree that the current state-of-the-art in developing exploit-free code involves some kind of combination of the following:
1.) Code Reviews
3.) Static Analysis Tools
Each one of these has its drawbacks. Code reviews certainly can find exploits, but their primary drawback is that they don't scale very well. The result is that they are used on large software projects, but will only cover a small percentage of the actual code-base (when doing formal code reviews) or don't usually cover it very thouroughly (in more ad-hoc style code reviews).
The point is that even now code reviews must be combined with other techniques. The patchwork of different techniques can lead to relatively well performing software, depending on how well they are each done. I think OpenBSD does a relatively good job because they've tried to focus their attention on security. But this is certainly is not an ideal solution.
I eagerly await proof carrying code and provably correct software. For now, we can just use what seems to work best.
|I eagerly await Tom7BSD, where all the code is proved to be correct.
No, I'm sure your work has value, but for the time being, I don't see anyone using it to write operating systems. And you say you have "damning" evidence against OpenBSD as though you've exposed them as a bunch of frauds. Show me the more secure OS that I can use today. If you can't do that, don't try and sink their work and say it's a waste of time.
In theory, your way might be better once it gets out of the research phase, but that's just speculation, isn't it? In practice, the bad code in your examples would be easily spotted.
You said some stuff that's downright silly. For example: "...protections... make it more difficult (but seldom impossible) to exploit the bug." That's simply not true; they do make a lot of attacks impossible. Then you say "the OpenBSD methodology of code scrutiny simply doesn't work" based on "security patches... for BSD (and Linux, etc.)". The other BSD projects and Linux etc. are obviously not using the OpenBSD methodology. My one-year-old release of OpenBSD (3.7) has in fact only had eight security patches, which is not that much. You're not being realistic with your criticisms.
Finally, regarding your mentions of PhD work, following Bugtraq, etc., please look up "argument from authority." (Spoiler: It's fallacious.) I don't think proof-carrying code is worthless -- I would lose that argument. But that doesn't make your attacks on OpenBSD any less ridiculous.
So how near term do you expect this proof stuff to enter the mainstream? 5 years? 20 years? When hell freezes over?
Say, here's a challenge for you. Make an AAD using sound synthesis software written and proven correct in the same day. Can you do it? I've been thinking of trying the same, except without the proof part.
|To "make a lot of attacks impossible" is precisely to make it more difficult, but not impossible, to exploit a bug. It is impossible to exploit a bug if and only if all attacks based on it are made impossible.|
|I want to keep this above the belt, but I feel the need to respond to these unfair remarks:
"... please look up 'argument from authority.'" I mentioned my credentials only in response to your accusation that I am "commenting on something [I] know nothing about." It's simply not true that I am ignorant. This is not argument by authority, because I never claim that my arguments follow from my expertise, just that they are informed by it.
Second, you are misrepresenting my remarks. I never said that there exists a more secure operating system today than OpenBSD. I never said OpenBSD is a waste of time. You seem to inrepret my remarks as insisting that people should drop what they're doing and switch to a magical solution that is not yet available outside of research communities. How could I insist that? I'm running linux this very second. But I do think it makes sense to own up to problems with the way we build software, and be open to new techniques for solving them.
So, let me say again what I am actually claiming. I'll number them so that if you'd like to say I'm saying something, you can just point to the number and respond.
1. Even if OpenBSD is the most secure practical operating system available, it is not secure. I am completely confident that there will continue to be security patches for your installation of the OS over the next year.
2. Observe that these bugs resulted (as far as I know) from accidental mistakes. Therefore, code review does not catch all accidental mistakes.
3. This contest reminds us that it's easy to maliciously embed bugs. Anything we can do by accident we can do on purpose, and I'm assuming that maliciously hidden bugs are harder to find than accidental ones, which I hope is not controversial. So if this is true, code review finds even fewer maliciously hidden bugs than accidental ones.
4. Formal techniques are superior because the intention of the coder is irrelevant, and the attention of the reviewer is unlimited. Whether he makes a mistake or deliberately tries to hide devious behavior, the proof-checking problem is just as easy. Computers are supremely patient, and thus can do a much more thorough job, much more often, than people.
5. Because the codebase is getting larger, and the contributors more diverse, we can expect more accidental bugs and more malicious bugs. So, not only are current techniques unable to find all bugs, the problem is getting harder. We need new techniques!
This is all I was saying in my original post. Some things have come up since then that I'd like to clarify my position on.
A. I stick by my statement that the countermeasures that we listed make it "more difficult (but seldom impossible) to exploit the bug." (Note this doesn't imply that it never makes them impossible, as you seem to claim I was saying (?)) Techniques based on randomization are not impossible to exploit (StackGuard, ProPolice, randomized malloc/mmap), nor do they address all kinds of bugs. W^X (and the analogous part of PaX) does absolutely protect against code injection, which is great, but not against returning to parts of code that already exist and might implement the malicious behavior for you. You know this, naturally; people on bugtraq have demonstrated attacks and the documentation for the software admits to the fact that the protection is not perfect.
B. Even though they are not perfect, the existence of these countermeasures does increase security. I hope that vendors of all operating systems start turning them on by default, the speed penalty be damned.
C. Nonetheless, I think the focus looking to the future is wrong. By accepting that some programs have almost completely unpredictable behavior under bug conditions (ie., most C/C++ programs, ie., most BSD/Linux/Windows programs) and then attempting to limit the damage that can be caused by that wild behavior, we're giving ourselves a much harder security problem than we need. Instead, I maintain that we should use formal techniques to guarantee that such "undefined behavior" can't arise in the first place. The great thing about these techniques is that the protection is automatic and absolute (up to bugs in the implementation of the proof checker or compiler, which are limited to a single source rather than every application, and which C/C++ programmers already have to live with anyway).
D. Some formal techniques, like static typing, are available today in practical form for many applications. If you want to write applications that can't have buffer overflows, integer overflows, heap corruption, double-free, format-string vulnerabilities, etc., are easier to write (and review) than their C counterparts, and have very competitive performance, you can download an SML or O'caml compiler today and go for it. I write almost all my software in SML, and it's perfectly practical for real applications today. It doesn't give you a proof of correctness, but it does give you a guarantee of memory safety. That's enough to exclude with certainty a large class of security problems that we have to patch our BSDs and Linuxes several times a year for. It also addresses other problems of large software programming (like debugging and blame assignment); it's really a win on many fronts.
I can't predict when we'll have practical tools for writing applications and proving them correct. There are lots of people working on it. I'm even less optimistic for near-term mainstream acceptance. One of the reasons is that communities like OpenBSD's are very closed to the possibility of using new languages. ML is over 30 years old; just in the last decade have we seen some small fraction of the ideas there make it into mainstream languages like Java. (PS. I don't think Java is a good language or representative of what I'm arguing for.) I wish I knew how to reach slashdot kind of folks with these languages, because I genuinely believe that they would be appealing to exactly the same kind of person who chooses Linux or BSD because they are better than Windows, despite being less popular. Unfortunately from my perspective there is a kind of Zeitgeist that it is better to suffer through writing painfully low-level C code, carefully avoiding its many dangers--that somehow through suffering one produces a better product. I think that the OpenBSD folks are especially guilty of this. Suffering is not virtuous. We have better ways already--especially if security is your #1 goal--and your original post is a good example of the kind of allergic reaction that many people from the "hacker" community have when faced with languages more modern than C. I don't really understand this, and I wish I did. Perhaps you have some insight. Academics absolutely do have the responsibility to make their work accessible, but "real world" programmers also have the responsibility to then try that stuff out and make concrete criticisms--or at least not demand to see futuristic stuff if they have not yet looked at stuff that's 10 or 20 years old. The mainstream should not flow only one way.
As for your challenge, I'm not really sure I understand the point. I'm not saying that writing correct code and proving it is fast, so the 1-day time limit seems undue. I also think the cost of errors in sound synthesis isn't very high, so the challenge of proving it correct also seems undue. But I will say that if I were to write some procedural music software (which is actually high on my list, so it may happen) today I would definitely use SML. Can you explain what it's supposed to demonstrate, or propose an alternate challenge?
|*BSD is dying|
|It's supposed to be a cool stunt that may yield interesting results, just like the whole AAD thing in the first place.
I mostly agree with the substance of your points, but you offend me with style. Your general tone seems to be that people who are doing real-world stuff are drawing with crayons and should grow up.
I've experimented a little bit with Scheme and Haskell (not SML), so I am not allergic to higher-level languages. Here are a few reasons they may not be in wider use:
1. Performance, disk space, memory usage, etc. I know it's trendy nowadays to suggest that these are no longer concerns, and for us rich kids with new PCs, maybe they're not. But OpenBSD (for example) builds and runs on a lot of older machines that you would consider pretty slow. There's also embedded usage to consider, and small machines -- OpenBSD runs on the Zaurus, a handheld machine too small to fit libraries with debugging symbols -- and there's donated computers in third-world nations, and so on.
True, you can write just your most performance-critical code in C, but that doesn't save you from the disk and RAM requirements.
2. Operating system kernels have to do a lot of low-level stuff. That really calls for a low-level language. There's microkernels, but they're only beginning to get past the research-project stage.
3. These languages are moving targets. C has essentially been the same since 1989, and there's something to be said for that.
Even if the high-level languages are fairly old, like Lisp which has been around half a century, they've only been put to limited real-world use. The best way of doing such-and-such a thing may still be in the process of being hashed out. The libraries and implementations are less mature, in other words.
4. The implementations are big, and OS project leaders may see them as clunky, poorly designed, etc. In particular, OpenBSD has a very high standard for code quality in its base system, and these high-level language implementations, being relatively immature, may have hackish parts to them. There may also be licensing issues, since a lot of language implementations are unfortunately under the GPL/LGPL or worse. (Not SBCL, GHC, MLton, SML/NJ, or TILT, though.)
It's worth pointing out that not too long ago, OpenBSD rewrote their package management tools in Perl, partly to avoid having to worry about buffer overflows. I'm not saying Perl was the best choice, but there you go. I've also noticed you wrote Escape in C++; why is that?
By the way, why can't you run your compiler through your proof checker? I can understand the issues with proving correct the runtime system or the proof checker itself, but the compiler? There shouldn't be any problem checking that.
That's my comment on the above back-and-forth. :)
|I think it's funny that this entry is called "Question and Answer Session"|
|Looks like Mr. Anonymous has some personal interest in OpenBSD and defends it by attacking its critics. Not very scientific, but it's often effective in the real world, just ask George "you're either with me or against me" Bsh. Such behavior has no right to claim high ground...
|dc-dc: Okay, well I am glad you mostly agree with me. I don't understand what it is about my style that offends you, but honestly with comments like "go back to your proof toys" (etc.) I'm not sure I want advice on this subject from you.
I agree that C still has its niche, and wasn't suggesting that we have any good replacement for it today. The same methods, like type systems, that we use to ensure the safety of high-level languages also apply to low-level languages, and there is plenty of research into low-level languages with no more memory or disk usage than C.
But for the vast majority of its applications and the the vast majority of its users, C is an inappropriate language. This is especially true if the project's foremost goal is security.
Using C++ for Escape was probably a mistake; it is very painful to use and often keeps me from getting around to adding features that I want to add. The reason was that at the time I started the project, it was not easy to link to the SDL graphics library using mlton nor compile for platforms other than linux. I hope that I get the time to rewrite it in ML some day.
I disagree with point #3, especially in the case of SML. SML is the most formally defined real-world language that I know of, and the only one for which people have proved properties about the language itself in a mechanized way. C changes (like with the C99 standard) from time to time like any language (SML was revised in 1997 and is undergoing another revision process right now). C is a much broader target in the first place because of all of the stuff that is implementation defined. (In fact, I would claim that the implementation- and un- defined behavior is really at the heart of the security issue.)
Additionally, the licensing argument is a canard because I'm suggesting switching away from gcc (which is GPL, of course) to compilers that are BSD licensed. Thus the situation is exactly the opposite of what you say.
You're right that a compiler like any program is subject to the same arguments. Proving a compiler correct would mean stating its behavior as a proposition, which means defining the source language formally and formalizing some subset of the architecture. Xavier Leroy is working on something rather like this for a simple language. We also have compilers like TILT that produce typed assembly language, certifying that the binary that is produced is at least memory safe.
As for things we can do today, it's totally possible to write compilers in safe languages and enjoy memory safety and other properties for them as a result. Every SML compiler I know of (at least 5) is written in SML, but that's mostly just because ML is a great language for writing compilers.
I have a feeling this argument is coming to a close, so let me just summarize my position again:
1. Since code reviews don't catch all bugs, but security relies on catching all bugs, we want methods that are more infalliable.
2. Some methods are available today (like safe programming languages) that would instantly make applications for which they're appropriate more secure. In particular they'd make the largest classes of security problems in unix software impossible.
3. Looking towards the future, it's important to not be dismissive of technology that can make programmer's jobs easier. Technology is nearing that would make formal analysis practical for a wider variety of programs and properties. Formal analysis is less falliable, scales better, and is more maintainable (because it does not require a human to look at the entire code every time more code is introduced). Like high-level langauges, getting a computer's help is easier and works better, but that doesn't make it a toy. OpenBSD's charter should be about security, not suffering.
|The main point I'm getting out of this is "I should take another look at SML," and I will. Previously, pretty much everything I'd read about the language was bad: Lisp users bashing it for being static and thus lacking expressive power; Haskell users bashing it for being impure and non-strict and thus lacking expressive power; and users of the language who say it's "dead" and they need to make a new dialect of it so they can add much-needed features (thus "Successor ML"). I figured if the language is criticized from all these different angles, it must suck. But now I'll look at it again.
The licensing argument may not be applicable specifically to SML, but it applies to Ocaml, which you also mentioned. I need C either way for my operating system, and, I'd guess, for parts of the language's runtime system, so that's moot. Also, gcc is not the only C compiler out there: there's TenDRA, for example.
Conclusion: We're both right.
Also yes, the wording in the first post from this IP address was ill-chosen. Sorry about that.
|Moral: decide for yourself what "expressive power" means, by trying to express your favorite programs and invariants in various languages. Exercise: how would you express abstraction invariants in lisp, seeing as how it lacks a static type system?|
| The judge who put coded messages in his Da Vinci Code plagiarism trial ruling has written another...|
|what is type casting