Cryptography


  • Cryptography

    Last year I made a bunch of posts about cryptography on a non-tech forum, mostly so that people would stop posting threads about how they didn't trust their browser enough to send a company their credit card information over an encrypted connection and were instead going to just email the company their credit card information. I doubtless made some of these posts while drunk, so feel free to correct them. I'm not a crypto expert by any means - I worked for an obscure branch of the Air Force Research Laboratory, not the NSA, but we needed somebody who was conversant on crypto and I volunteered. Any references to Gino/Ginormous, MilHip/Military Hippie, or Warto are references to former co-workers (Ginormous was ginormous, Military Hippie was a hippie who worked for the military, and Warto had a wart).

    Private / Secret / Symmetric Key Cryptography

    This is the type of cryptography where you encrypt data using a secret key, and then decrypt the data using the same secret key (hence "symmetric key"). These generally break down into two classes of ciphers, stream ciphers (encrypting a bit at a time) and block ciphers (encrypting a block at a time, where a block is usually 64 or 128 bits). You can construct a stream cipher out of a block cipher if you use something like Counter Mode, and block ciphers receive more attention (this is a good thing, more on this later), so forget about stream ciphers.

    The two block ciphers you're most likely to encounter are DES (Data Encryption Standard - specifically you'd encounter Triple-DES, which is three DES passes (encrypt-decrypt-encrypt) over the same data, because vanilla DES has a key that's too small and is susceptible to brute-force attack) and AES (Advanced Encryption Standard). DES is old, was developed more-or-less in secret, and is relatively slow. AES is new, was developed in an open competition between a crapton of cryptographers, and is relatively fast; it's also approved by the NSA for encrypting secret data (using 128-bit keys) and top secret data (using 256-bit keys). The NSA would not approve AES if it were aware of a viable attack against it. If you know of an attack against a cipher, then it's only a matter of time before the Bad Guys discover the attack (assuming they have not already done so). The NSA is not going to open up the rest of the government to attack just so they can decrypt your porn folder.

    Block Cipher Modes

    A block cipher only encrypts a single block of data (in the case of AES this is a 128-bit block), so if you want to encrypt more data you need to use a block cipher mode. The simplest mode is to use an Electronic Codebook - just encrypt each 128-bit block using the key and concatenate them all. This is a horrible idea because it leaks a ****ton of information to an attacker, namely the attacker will know if two or more blocks are repeated. There are better alternatives, one of them being Counter Mode: take a nonce ("number used once" - it doesn't need to be secret, but it must never repeat under the same key), concatenate it with a counter (any non-repeating stream of numbers, but a non-negative integer that increments at each step works fine) to generate a huge-ass block of crap, encrypt said crap and xor it with your data, and voila.
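
    To make the moving parts concrete, here's a toy counter-mode sketch in Python. SHA-256 is standing in for a real block cipher (you'd use AES in practice), and the key/nonce values are made up - this is a sketch of the idea, not something to protect an actual porn folder with.

    Code:
    import hashlib

    def keystream_block(key, nonce, counter):
        # "encrypt" nonce||counter - SHA-256 as a stand-in pseudorandom function
        return hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()

    def ctr_crypt(key, nonce, data):
        # the same function encrypts and decrypts, since xor undoes itself
        out = bytearray()
        for i in range(0, len(data), 32):                 # 32-byte "blocks"
            block = keystream_block(key, nonce, i // 32)
            out += bytes(a ^ b for a, b in zip(data[i:i+32], block))
        return bytes(out)

    key, nonce = b"sixteen byte key", b"never-reuse-me"
    ct = ctr_crypt(key, nonce, b"attack at dawn")
    print(ctr_crypt(key, nonce, ct))                      # b'attack at dawn'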

    Public Key Exchange

    Oh boy here comes math. The problem with secret key cryptography is that if I'm trying to send some secret **** to Gino then we need to have the same key, but how the hell do we agree on a key when Military Hippie is spying on us? Public / Asymmetric Key Cryptography to the rescue! Diffie-Hellman is the easiest one for me to understand, so that'll be the example I'm going to use.

    Gino and I agree (in the open, where anybody can eavesdrop) on a large prime number p and a generator g (a primitive root mod p). I choose an integer a and Gino chooses an integer b. I send Gino (g^a mod p), she sends me (g^b mod p), I take ((g^b mod p)^a mod p) while she takes ((g^a mod p)^b mod p), and thanks to some mathematical wankery we wind up with the same number (our secret key), meanwhile nobody who's been eavesdropping can figure out what the hell just happened. Yay math! I'd go into more detail but to hell with math.
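
    Here's the whole dance in a few lines of Python, with a comically small prime so you can follow the numbers (a real p is thousands of bits long - this illustrates the math, nothing more):

    Code:
    p, g = 23, 5                        # public: a prime and a generator
    a, b = 6, 15                        # my secret and Gino's secret
    A = pow(g, a, p)                    # I send Gino g^a mod p   (= 8)
    B = pow(g, b, p)                    # Gino sends me g^b mod p (= 19)
    print(pow(B, a, p), pow(A, b, p))   # 2 2 -- same shared key, never transmitted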

    Attack!!!

    The problem with vanilla Diffie-Hellman is that it's subject to a man-in-the-middle attack. I'll skip the math notation and sum it up as: Military Hippie intercepts all communications between Gino and me, and performs a key exchange with each of us. Now all of my data securely goes to Military Hippie, who decrypts and reads it, then encrypts it (after modifying it if that's what floats his boat) and passes it along to Gino. As far as Gino and I know we're performing a secure data exchange, but meanwhile Milhip is reading our mail.

    Digital Signatures

    A common solution is to use digital signatures to ensure that man-in-the-middle attacks fail. There's more math involved here but **** those tricks, the bottom line is that before performing a key exchange I verify the other party's identity by confirming their digital signature. How do I know what somebody's digital signature is? Some Trusted Authority tells me. That's why when you're doing online commerce you'll see that little lockbox at the bottom of your browser; if you click on said lockbox then it'll say something like "verified by Verisign" or something, meaning that you trust Verisign to properly identify the entity you're doing your online banking with or purchasing porn from or whatever.

    Hashing

    A hash is a fixed-length obfuscated summary of your data. F'rinstance, if you xor all of the blocks generated by a block cipher, then you're going to produce a block-sized summary of whatever it is you've encrypted - then if somebody mucks around with your data (inserting false blocks or whatever) then the guy on the other end will detect the tampering by comparing his hash with your hash. Note that it's important that the adversary not be allowed to muck around with the hash, otherwise he can just muck with the data, hash it, and replace your hash with his hash. Hence a hash function needs to incorporate a key (a keyed hash used this way is called a message authentication code, or MAC), so that to muck with your hash the adversary needs to figure out what hash function you're using (easy) and figure out what key you're using (extremely difficult). Note to computer scientists: cryptographic hashing is much more secure and much more expensive than the hashes used in hash tables. In a hash table it doesn't particularly matter if somebody is able to reverse the hash, but the whole point of a cryptographic hash is to prevent anybody from reversing it.

    Hash functions come in a variety of flavors intended for different purposes:

    Password Storage. In older operating systems, passwords were stored in plaintext files; you'd type in your password and the OS would do a simple string comparison with the password stored under your username. Only the admin/root had access to this file (unless a system vulnerability gave somebody else access to it), but this scheme meant that a) the admin had more power than was often appropriate (the chairman of the board probably doesn't want somebody in IT to be able to read all of his email) and b) an attacker could crack a system just by removing the hard drive and reading the file directly.

    The solution is to use a powerful hash function such as SHA-256 or SHA-512 (which produce 256/512-bit hashes respectively) on passwords. When you type in your password, the OS hashes it and compares it to the stored hash. This has the added benefit of reducing the impact of timing attacks (older operating systems doing a string comparison of plaintext passwords would reject the password after finding a mismatched character, so an attacker could deduce the password by timing how long it took the OS to reject it) because even if the hashed password were rejected after a single mismatched machine word this would not reflect on how many characters of the unhashed password were correct.
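
    A minimal sketch of the hash-and-compare login check in Python. The constant-time comparison (hmac.compare_digest) is the fix for the timing leak described above; note there's no salt yet - that's covered two sections down.

    Code:
    import hashlib, hmac

    stored = hashlib.sha256(b"hunter2").hexdigest()       # written at enrollment

    def check(attempt):
        attempt_hash = hashlib.sha256(attempt).hexdigest()
        return hmac.compare_digest(attempt_hash, stored)  # constant-time compare

    print(check(b"hunter2"), check(b"hunter3"))           # True False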

    Message Authentication. This uses a "cheaper" keyed construction like HMAC (a MAC built out of an ordinary hash function), because while with password hashing we want it to take several years/centuries for the attacker to reverse the hash, with message authentication we can usually afford to have hash reversal take only a couple of days. Let's say that I'm doing some online banking, and send the bank some encrypted data on a transaction I'd like to perform. I hash this data so that the bank will know that nobody is mucking with our transaction. Somebody intercepts this message and three days later has reversed the hash, meaning that they can do things like attempt a replay attack - except that now the data are stale, and in all likelihood the bank and I have been encrypting timestamps in with our data. Internet transactions have to be able to tolerate a few minutes of skew in timestamps as messages are lost and re-sent, but a few days' anomaly in the timestamps is cause for rejecting the message.
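
    For the message authentication case, Python's standard library has HMAC built in. The key and message below are made-up examples; the bank's side just recomputes the tag and rejects on mismatch:

    Code:
    import hmac, hashlib

    key = b"shared secret key"
    msg = b"transfer $100 to Gino; timestamp=2009-07-23T18:52"
    tag = hmac.new(key, msg, hashlib.sha256).hexdigest()

    # the bank's side: recompute the tag and compare in constant time
    ok = hmac.compare_digest(tag, hmac.new(key, msg, hashlib.sha256).hexdigest())
    print(ok)                                  # True: message arrived unmodified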

    Salting

    This is a technique for preventing birthday attacks. The Birthday Paradox is that people expect that they'd need about 183 people (365/2) in a room for there to be a 50% chance that two of them share a birthday, but the function is not linear, and in fact you only need about 23 people - the general rule is that a "collision" becomes likely once you have roughly the square root of the number of possibilities (the square root of 365 is about 19). In the hashing section I mention SHA-256 and SHA-512, and in the AES section I mention AES-128 and AES-256 -- the hash outputs are twice as long as the encryption keys to get the same level of security, because a birthday attack finds hash collisions in roughly the square root of the number of possible hash values, i.e. half the bits.

    What this means for those of us who aren't designing hash functions is that we need to salt our password files. Let's say I'm an administrator for a database that has millions or billions of users (LexisNexis or something). Everybody's got a password hashed with SHA-512, so the system is pretty safe - it's difficult to find a collision with SHA-512. But remember that passwords usually don't have much entropy, and add to that the fact that there are billions of passwords hashed in my "passwords.dat" file. An attacker only needs one password to do damage, so if he gets his hands on my passwords.dat file he can just start hashing low-entropy passwords and checking each test against all of the passwords at once - with billions of targets the math works like the birthday paradox, and the odds of the attacker cracking some password go way up. So we append a unique random string of, say, 512 bits to each hashed password, hash the resulting 1024-bit string with SHA-512, and then store the new hash alongside the random 512-bit salt. Now the attacker can only attack passwords one at a time - he can't hash a candidate password and compare it to a billion other hashed passwords, instead he must hash the candidate, re-hash that hash appended with one user's unique salt, and then compare the result against that single user's stored hash.
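
    Here's the salting scheme just described, sketched in Python with a 512-bit salt from the OS's random number generator (the enroll/verify names are mine, not anybody's API):

    Code:
    import os, hashlib, hmac

    def enroll(password):
        salt = os.urandom(64)                          # unique 512-bit salt
        inner = hashlib.sha512(password).digest()      # 512-bit password hash
        return salt, hashlib.sha512(inner + salt).digest()

    def verify(password, salt, stored):
        inner = hashlib.sha512(password).digest()
        return hmac.compare_digest(hashlib.sha512(inner + salt).digest(), stored)

    salt, stored = enroll(b"correct horse battery staple")
    print(verify(b"correct horse battery staple", salt, stored))   # True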

  • #2
    Reverse Engineering

    It's a Good Thing to have open encryption standards, because it means that a lot of the Good Guys are able to break the crypto if the crypto is breakable. Keeping the cryptographic algorithms secret is known as "security through obscurity," and generally means that only the Bad Guys are able to test the algorithm (the Good Guys would probably be breaking intellectual property laws if they reverse-engineered the code, whereas the Bad Guys don't care). Any algorithm can be reverse-engineered with some static analyses and a debugger, so it's best to assume that everybody will know how your crypto is implemented. Security should rely on the key being kept secret, not on the algorithm being kept secret.

    Timing / Power Analysis

    The problem is that public-key cryptography is complicated, which makes it susceptible to side-channel attacks. Diffie-Hellman and RSA (two popular public-key algorithms) rely on the difficulty of operations on large integers (large meaning hundreds or thousands of bits long). However, if the implementers are sloppy then this can leak key information - operations on word-sized (32-bit or 64-bit) integers always take the same number of cycles, but the number of cycles needed for an operation on 1024-bit integers depends on the values of the long-ass integers (e.g. sometimes you'll find the greatest common divisor in a short amount of time, other times you'll need a lot of iterations before you find the GCD), so an attacker can time the operations performed with various public keys in order to deduce the value of the private key.

    A related problem is that operations on large integers may not always exercise the same code paths. When the algorithm is implemented on a smart card, an attacker can probe the card to determine which public keys exercise which code paths, and use this information to deduce the private key's value.

    Blowing Stuff Up

    Another problem with storing secret keys on devices that aren't physically secured (e.g. smart cards) is that the algorithm designers usually include the implicit assumption that e.g. nobody is going to take a hammer to the hardware, so they'll store the secret keys in RAM knowing that if the power is cut (because somebody is attacking the system with a hammer) then the RAM is wiped. However, because of its physical properties, if you freeze RAM then it maintains its charge for minutes or hours, so you can get the secret keys out of a computer by dunking the thing in liquid nitrogen, removing the RAM chips, and reading off the memory before it loses its charge.

    Rubber Hose Cryptanalysis

    One of the disadvantages of cryptography is that encrypted data are statistically different from unencrypted data. F'rinstance, in a plaintext file we'd expect to find the ASCII or Unicode representation of 'e' to show up very frequently, whereas an encrypted text file is more-or-less random. This means that if a country passes a law against using cryptography, then the authorities can track you down just by running a statistical analysis on the data transmitted from your computer. (More likely the authorities will only outlaw strong cryptography but allow you to use weak cryptography i.e. crypto with a too-short key. That way they can pinpoint the data that you think are important enough to encrypt and ignore the dross that you can't be arsed to encrypt.) Even if a country permits crypto, that doesn't necessarily stop the authorities from forcing you to surrender the crypto keys via threats or application of sanction or force (hence the name "rubber hose cryptanalysis").

    The solution is called steganography. This is intended to provide you with plausible deniability - maybe the authorities will stop torturing you if they believe that you've already surrendered all of the crypto keys. Here's a vanilla example: you email somebody an encrypted bitmap of your genitals. Embedded within this bitmap are the encrypted plans for world revolution of some sort - each pixel of the bitmap is represented by, say, 16 color bits, and you tweak the least significant color bit so that you achieve a transmit rate of one bit per pixel. The authorities flag the bitmap as being encrypted and, suspecting you of fomenting world revolution, tell you that they will execute your family if you don't surrender the crypto keys. You relent, and the authorities decrypt a picture of your genitals. Obviously you'd want to encrypt this picture when you post it to amateurpornrevolution.com (a site for revolutionaries who are also amateur porn stars in their spare time), otherwise everybody will get a gander at your genitals. The authorities don't execute your family, the porn stars successfully launch their revolution, wutang for steganography.
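
    The bit-twiddling itself is almost embarrassingly simple - here's a sketch operating on a list of 16-bit pixel values (made-up numbers, one secret bit per pixel):

    Code:
    def embed(pixels, bits):
        # overwrite each pixel's least significant bit with one secret bit
        return [(p & ~1) | b for p, b in zip(pixels, bits)]

    def extract(pixels, n):
        return [p & 1 for p in pixels[:n]]

    cover = [52011, 52010, 51987, 52400, 52401, 52122, 51777, 50900]  # 16-bit pixels
    secret = [1, 0, 1, 1, 0, 1, 1, 0]        # one byte of world-revolution plans
    print(extract(embed(cover, secret), 8))  # [1, 0, 1, 1, 0, 1, 1, 0]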

    Except it's not that simple. A statistical analysis of the decrypted bitmap will reveal that the least significant bits have been tampered with - the manipulation is not visible to the naked eye, but it's glaringly obvious to a computer. At this point, steganography still relies on security through obscurity - it relies on the assumption that the authorities won't know about your bit-twiddling antics, and this is not a safe assumption.

    Steganographic file systems are potentially more secure, if you don't mind losing storage space. A stego file system fills the entire free space with encrypted random numbers, so if your encrypted partition is 100gb then it looks like you've got a single 100gb encrypted file (until you enter a password). The password unlocks the visible encrypted portion, which you'll probably want to fill with porn. Meanwhile, your plans for world revolution are encrypted under a second key in the hidden sub-partition. You access the stego-system by providing a file name and password, and if either fails to match then the file system gives a "file not found" error. The plans for world revolution look no different than the blank space on the partition, so you've got plausible deniability... if you don't mind losing your data. The only way to plausibly claim that the stego-space is part of free-space is to allow the authorities to write to stego-space if that's what floats their boat. Hope you had a backup copy of your world revolution plans stored somewhere else...

    Multi-Level Security

    One of the holy grails for operating systems security is to be able to have unclassified, secret, and top secret information all on the same system, with a security policy enforcing the flow of information (IIRC it is the Bell-LaPadula model that says that you can write to a higher security level or read from a lower security level but not vice-versa). Two reasons why such an operating system does not exist (if somebody has to work on systems at two or more classification levels then they'll have one computer per classification level) are covert channels and Rice's Theorem.

    Covert Channels

    Consider an operating system that stores both unclassified and secret files. Somebody working at the unclassified level can write a file to the secret level, and somebody working at the secret level can read a file from the unclassified level. Unclass cannot read from secret and secret cannot write to unclass. In theory, this prevents unclassified users from having access to classified information.

    User Blargh at the secret level wants to leak information to user Poot at the unclassified level. With our hypothetical multi-level operating system, if Poot tries to write to a pre-existing file at the secret level, then the operation fails (the system doesn't want Poot destroying secret information by intentionally or inadvertently overwriting the secret files). Now Blargh and Poot can communicate by having Blargh generate a series of files representing "true" bits and having Poot attempt to write to said files. F'rinstance, if Blargh is trying to communicate the 1-byte value 00100110, then he'll create the files "00100000.dat", "00000100.dat", and "00000010.dat". Poot will try to write to the files "10000000.dat" through "00000001.dat", and will record the errors received on the three files that Blargh already made. Blargh has just sent a byte of secret data to an unclassified user.
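
    Here's the channel simulated in a few lines of Python, with a set standing in for the file system and "name already in the set" standing in for "write failed" (all names and framing are just for illustration):

    Code:
    fs = set()                                   # stands in for the file system

    def fname(bit):
        return format(0x80 >> bit, "08b") + ".dat"  # "10000000.dat" ... "00000001.dat"

    def blargh_send(byte):                       # secret side: one file per set bit
        for bit in range(8):
            if byte & (0x80 >> bit):
                fs.add(fname(bit))

    def poot_receive():                          # unclass side: probe all eight names;
        return sum(0x80 >> bit                   # a write to an existing file errors out
                   for bit in range(8) if fname(bit) in fs)

    blargh_send(0b00100110)
    print(format(poot_receive(), "08b"))         # 00100110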

    Rice's Theorem

    So why not just cut off all covert channels? Because it's impossible to do so. Rice's Theorem (I'm not going to go into the math because I don't really understand it myself, in my graduate level "theory of computation" class we skipped over Rice's Theorem because it was too complicated) says that no algorithm can decide a non-trivial property of an arbitrary program, where a "trivial" property is one that is true of all programs or of no programs. "Contains no covert channels" is exactly such a non-trivial property. So if some researcher says "hey the file system can introduce a covert channel" it is possible that the operating system is simple enough (remember, Rice's Theorem applies to arbitrary programs, not to programs intentionally made so simple that they circumvent it) for us to remove that particular covert channel, but the general problem of finding all potential covert channels is undecidable.

    Two Points of Failure?

    It seems counterintuitive that we'd use two algorithms (public-key and secret-key) to transmit a message, as it introduces two points of failure. Why not just transmit everything using public-key encryption and eliminate the potential failure-point of secret-key encryption? Efficiency.

    Some quick terminologicalification: computer scientists use Big-Oh notation to specify an algorithm's approximate efficiency. For example, most sorting algorithms are either O(n * log(n)) or O(n^2), where the base of the logarithm is usually assumed to be 2 (since everything in computer-ese is in base 2 except on those silly base-3 computers they made back in the day or on those silly base-23 (or whatever) quantum computers that we can't get to work properly). 'n' is the problem size. So if you're sorting (n=)1000 elements with mergesort (an O(n*log(n)) algorithm) then you'll need in the ballpark of 10^4 operations (log-base-2 of 1000 is about 10), and if you're using insertionsort (an O(n^2) algorithm) then you'll need in the ballpark of 10^6 operations. We're abstracting away a crapload of stuff here, f'rinstance the actual number of operations needed by mergesort might be something like 15n*log(n)+80n+10^6 while the actual number of operations needed by insertionsort might be something like n^2-50n, in which case for n=1000 insertionsort is actually faster than mergesort, but the point is that for n=huge mergesort is going to be faster because only the leftmost term matters. There's some other crap like big-Theta and big-Omega and amortization and memory-usage blah blah blah and if you're interested you should get a theory of computation textbook and/or ask me to wax poetic about it, but for our purposes you need to know
    1. Big-Oh gives the approximate number of operations needed to perform a big-ass calculation
    2. Smaller functions are better functions

    Public-key encryption is usually around O(n^4). Private-key encryption is O(n). Which would you rather use to encrypt a gigabyte of porn to get around your school/work firewall? Hint: not public-key encryption.
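
    You can feel the gap from your own keyboard. The sketch below times one RSA-style modular exponentiation on a 2048-bit stand-in modulus (roughly what public-key crypto does per ~256-byte block) against XORing a 256-byte block (what a stream cipher does) - not a rigorous benchmark, just the shape of the difference, typically a couple of orders of magnitude even with Python's interpreter overhead inflating the cheap side:

    Code:
    import time, secrets

    n = secrets.randbits(2048) | (1 << 2047) | 1     # stand-in 2048-bit modulus
    d = secrets.randbits(2048)                       # stand-in private exponent
    m = secrets.randbits(2040)                       # one block's worth of message

    t = time.perf_counter()
    pow(m, d, n)                                     # public-key work for ~256 bytes
    print("modexp:", time.perf_counter() - t)

    blk, ks = secrets.token_bytes(256), secrets.token_bytes(256)
    t = time.perf_counter()
    bytes(x ^ y for x, y in zip(blk, ks))            # stream-cipher work for 256 bytes
    print("xor:   ", time.perf_counter() - t)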

    Replay Attacks

    I've been mentioning things like timestamps and counters and whatnot, but what gives? These things are intended to prevent replay attacks, which is when an attacker copies an encrypted message that you're sending and then re-sends the message at a later time. Let's say you've got one of those remote-car-unlocking things. I click the "unlock door" button, sending a plaintext packet to the car. The car authenticates the packet (this step is optional, but no sense wasting energy on transmitting to any tom dick and harry with a key fob), then sends a "challenge packet" to my key fob. My fob encrypts the challenge using a shared secret key, then transmits the encrypted "response packet" to the car, which verifies that I've properly encrypted it and unlocks the doors for me. Problem being, if an attacker has been recording all of this, then he can initiate the whole thing (he's recorded my initial plaintext packet), receives the challenge packet, and sends a copy of my response packet. Unless the key or the challenge packet has changed, the attacker has just succeeded in a replay attack.

    Counters and timestamps hamper replay attacks by secretly modifying the challenge packet. Let's say that my key fob and the car share a counter, which is just an integer that gets incremented each time we have a challenge-response session. The car sends a challenge packet, I encrypt the challenge packet concatenated with the counter and send the response packet, and the car checks that the response packet concatenated with the same counter decrypts correctly. Now when the attacker sends the duplicated response packet it won't pass muster, because the car's counter will have been incremented and won't match the replayed response packet's counter.
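
    Here's a toy version of the fob/car protocol in Python, with HMAC standing in for "encrypt the challenge" and every value made up. The second verification attempt is the replay, and it fails because the car's counter has moved on:

    Code:
    import hmac, hashlib, secrets

    key = b"shared fob/car key"
    fob_ctr, car_ctr = 0, 0

    def fob_respond(challenge):
        global fob_ctr
        fob_ctr += 1
        return hmac.new(key, challenge + fob_ctr.to_bytes(4, "big"),
                        hashlib.sha256).digest()

    def car_verify(challenge, response):
        global car_ctr
        car_ctr += 1                                 # counter moves even on failure
        expected = hmac.new(key, challenge + car_ctr.to_bytes(4, "big"),
                            hashlib.sha256).digest()
        return hmac.compare_digest(response, expected)

    c = secrets.token_bytes(8)                       # the car's challenge
    r = fob_respond(c)
    print(car_verify(c, r))                          # True: fresh response
    print(car_verify(c, r))                          # False: replayed response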

    But what if there are multiple key fobs (e.g. his and hers key fobs)? We could design the system to allow for some skew in the counter, but then we open the door up for replay attacks*. We could use timestamps instead of counters, but then we need to worry about clock skew and power draw (your key fobs would have better battery life if they didn't have to power a clock). We could use one counter per fob, but this may have difficulty scaling to a larger system. In the end the method used is dependent on the systems' resources and requirements.

    *This depends on how you implement the skew. If the fob sends a response to the car and the car checks to see if the counter is correct +- 3, then we've allowed an (albeit short-term) replay attack. If the fob keeps incrementing the counter until the car accepts the response, then we're able to handle the case where we have a few different fobs for the same car and we resist replay attacks because the counter never decrements (i.e. the acceptable counter range is [0..n] instead of [-n..0..n]), but this means that our energy-constrained fob may be sending an awful lot of response tokens which'll drain the battery and introduce unacceptable delays (if it takes ten seconds for the fob to find the correct counter value then people are going to get impatient and just use their keys instead). Or we could just send the counter in plaintext - rather than the car sending a challenge token that the fob concatenates to a counter prior to encryption, we could instead have the counter be the challenge token - as the challenge token changes each time, we're resistant to replay attacks.

    The point here is that it's not the cryptography that's weak, it's the cryptographic protocols. Once identities are established and keys are exchanged you're good to go, but until then you're in very dangerous territory.



    • #3
      Why Linux (Or Mac OS X Or Whatever) Is Not Secure

      First off, I'm not defending Windows. Windows is not secure for many of the same reasons that Linux et al are not secure. It is more secure for some reasons (better static analysis) and less secure for others (large target painted on it), but the bottom line is that almost all operating systems (that is, any operating system you'll be using) are not secure because they're not trivial and it's impossible to prove anything non-trivial about anything non-trivial.

      So what're we trying to prove here?

      I'm going to geek out here a little bit but I think I'm going to be staying within the realms of sanity (as well as is possible for me, at any rate). Type systems were developed back in the day to get around paradoxes in set theory. If you have a Universal Set that contains all other sets, then that means that the Universal Set contains all sets and the Universal Set, but that means that the Universal Set has to contain all sets and the Universal Set and the Universal Set that contains the Universal Set, etc. However, if you incorporate a type system into your set theory, then you can magic away a helluva lot of paradoxes - the Universal Set is of type Universal Set and contains all sets that are NOT of type Universal Set, and voila, the Universal Set can no longer contain itself but we can still **** around with it in interesting ways as limited by type theory.

      Anyway, type theory helps out a lot in computer science, because in most computers (that is, any computer you'll ever use, unless you're hiding a transputer in your basement or something) everything is just a Bag 'O Bits, and it sure would be nice to know how to interpret said bits without shooting ourselves in the foot. F'rinstance, a non-negative integer is just a binary representation of the number (e.g. 0101 = 0*8+1*4+0*2+1*1 = 5), but a floating point value is trickier. If we add 0101 in integer format to 0101 in integer format then we get 1010 (5+5 = 10, or 1*8+0*4+1*2+0*1). If we add 0101 in integer format to 0101 in floating-point format then we get garbage. So assign different types to integers and floating-points, and huzzah, instead of getting garbage we get an error (which "should" be caught by static analysis and/or by running the program a bunch of times and recording any errors that occur).
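
      You can simulate the mishap in Python with the struct module: the same four bytes are a perfectly sensible integer, and utter garbage as a float.

      Code:
      import struct

      raw = struct.pack("<i", 5)              # the four bytes holding integer 5
      print(struct.unpack("<i", raw)[0] + 5)  # 10: sensible, int bits + int bits
      print(struct.unpack("<f", raw)[0])      # ~7e-45: the same bits read as a float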

      So what're we trying to prove? That accesses to one "type" of program element aren't disguising themselves as accesses to another (probably more lenient) "type" of program element. Here's a quick f'rinstance: in our hypothetical language you can add, subtract, multiply, or divide integers. You can do bitwise manipulations to bitmaps. I've got a bank account that stores my balance as an integer, say it's 0001 followed by a bunch of 1's and 0's. If I treat the balance as an integer and add money to my account, then nothing odd occurs. However, if I can **** around with the bank's software and treat my account as a bitmap, then I can 'bitwise-or" my deposit of 15 dollars (1111 in binary format) with the highest parts of my account balance, so my balance of 0001 (times a million) becomes a balance of 1111 (times a million), i.e. I've just earned fourteen million dollars because a type error let me treat an integer as a bitmap.

      Hackery Tomfoolery

      So why aren't Windows/Linux/Whatever secure? Because they use a language (C, most likely) with a weak type system (I've read varying definitions for what constitutes a "strong" or "weak" type system - my personal definition is "if you're allowed to break the type system, then it's a weak type system"). This allows programmers a tremendous amount of flexibility (good for programmers, bad for the people analyzing/debugging the code) and at the same time a tremendous ability to cut their own throats (also "good" for programmers unless they're on the QA team and want a secure career, but usually bad for manager types).

      Here's a simple f'rinstance. The C language incorporates pointers to memory: let's say I'm making a list of objects (e.g. records in a database) that I want to traverse (start at the front of the list and proceed through the end of the list) with the ability to occasionally backtrack (so I can proceed from the back of the list towards the front if that's what floats my boat). The traditional way of doing this is to have each list element carry the address for the next and previous objects on the list, i.e., we need two words per list element.

      But wait, we can be ever-so-clever about this! What if we only stored one word per list element??? There's a complicated way to do this that you can google on your own (xor linked list), but suffice to say, you can xor the pointers (addresses) together and still traverse a doubly-linked list in either the forward or backward direction. The language standard makes this undefined behavior (you can do arithmetic on addresses within an array, but you can't change an address into an integer, do something unholy to the integer, and change the integer back into an address and expect portable results - unless you know what you're doing, and I have yet to meet anybody who knows what they're doing to this degree).
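
      Python won't let you abuse real pointers, but you can simulate the trick with integer "addresses" indexing into a dict that stands in for RAM - each node stores prev xor next, and 0 plays the null pointer:

      Code:
      mem = {1: ("A", 0 ^ 2),          # node at "address" 1: (value, prev^next)
             2: ("B", 1 ^ 3),
             3: ("C", 2 ^ 0)}          # 0 plays the role of the null pointer

      def traverse(start):
          prev, cur = 0, start
          while cur:
              value, link = mem[cur]
              print(value, end=" ")
              prev, cur = cur, link ^ prev    # recover next: (prev^next)^prev
          print()

      traverse(1)                      # A B C -- start from address 3 to go backwards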

      Blarg

      So do operating systems do ridiculous things like the "xor pointer trick"? Probably not - back in the day when RAM was expensive it was probably defensible to do stupid things to pointers, but nowadays it just falls under the "doing stupid things to pointers" category. What operating systems do is try to be invisible (except for the fancy-ass GUI), and xor pointer bit-twiddling won't accomplish that. The real threat is contiguous memory.

      Contiguous memory??? **** you!!!!!

      So your computer has a bunch of RAM with a bunch of 32-bit (or 64-bit if you're a fancy lad(y)) words in it (a "word" is the largest chunk of memory that a machine can use in one or two clocks), and you've got a ton of "normal" data structures (you're dealing with an operating system here, the rule of thumb is to stick to tried-and-true data structures for systems programming). "Contiguous memory" is memory that's adjacent. If I've got a list of discontiguous integers, then integer #1 may be at address FE05 while integer #2 is at D0EA. The integers probably aren't both in cache if they're spread out, and accessing memory that's not in cache is expensive. (Random access is also more expensive than sequential access, but not by much.)

      Systems programmers address this problem by using contiguous memory, usually called arrays (for sequences of identical data structures) or records (for heterogeneous but fixed-layout data structures). So instead of my fancy-ass linked list, I just put every record next to the previous record, and then I can traverse my list with pointer arithmetic (adding/subtracting numbers from addresses). "Buffers" are contiguous chunks of memory - they hold e.g. keyboard input until the CPU can process it, but if you give the buffer too much keyboard input then things may go disastrously wrong. Let's say your buffer can contain 32 elements, and an attacker inputs a few thousand elements in a short time period, thereby overwhelming a system designed to process no more than 32 memory elements. If the hacker is dumb/lazy then he can crash your computer with this vulnerability. If the hacker is smart/tenacious then he can use this vulnerability to become your system's or network's administrator.

      Fix it in hardware!!!

      I've seen this "proposal" from time to time, and the idiots making it are invariably artificial intelligence douchebags who think that it's possible for a computer to figure out if an integer is really an integer or is actually a pointer that is temporarily being treated as an integer. Also bear in mind that the hardware people and the compiler people don't seem to talk to each other, so whenever the hardware people try to help (or eliminate) the compiler people all we get is chaos. Y'know the MMX extensions in Intel chips? Compiler people HATE them. Compiler people want more registers (and maybe software control over cache and things of that nature), they don't want specialized (and (odds are) poorly designed) memory management hardware.

      Fix it at runtime!!!

      A much more reasonable request. Adve at the University of Illinois at Urbana-Champaign has done some work on ensuring that if a memory error occurs (e.g. freeing the same piece of memory twice) then the program will fail gracefully or disastrously (depending on system requirements) rather than letting the memory error break the program in unexpected ways.

      Fix it at compile-time!

      That's what my dissertation is about. Give me a few years to post the results.



      • #4
        Passwords

        Password creation is where the purity of cryptography gets really hosed over by the impurity of human beings who write their passwords on post-it notes that are stuck to their monitors. The key word here is entropy - how random is your password? Kolmogorov Complexity is an easy way to describe the problem (note though that you can't actually measure Kolmogorov Complexity - it's uncomputable): how complex would an algorithm have to be in order to describe the code phrase? If I've got a 128-bit string that is completely random (as in, the bit values are selected based on whether Schroedinger's cat lives or dies) then the machine needed to describe the string is huge, and we've got ourselves a high entropy string. However, if I've got a 128-bit string that's just alternating ones and zeroes, then it doesn't take a very large machine to describe the string. In fact, let's just say that our "machine" is the English language: the high entropy string is described as "one followed by zero followed by three ones followed by two zeros" etc., and the low entropy string is described as "the substring of one followed by zero, repeated sixty-four times."

        So let's say our passwords are 10 characters long, and each character can take one of eighty values (upper and lower case letters, numbers, and eighteen special characters thrown in for good measure). An ideal password would therefore have about 63 bits of entropy - there are 80^10 possible passwords, and log2(80^10) = 10*log2(80) is about 63 - which is already borderline, given that DES keys have 56 bits and are susceptible to brute-force attacks. And that's the ceiling: by an entropy model for non-ideal, human-chosen passwords (I'd post the paper about said model, but I'm working at home today and the paper is on my work computer), real passwords come in far below it, because people choose dictionary words and keyboard patterns rather than uniformly random strings.
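
        The arithmetic, for the skeptical (the passphrase line is a naive ceiling - real English text has far less entropy per character - but it shows why length wins, which is where we're headed below):

        Code:
        import math

        print(round(10 * math.log2(80), 1))    # 63.2 bits: the 10-char ideal
        print(round(28 * math.log2(27), 1))    # 133.1 bits: 28-char phrase, a-z + space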

        There are a couple of ways to fix this problem, and they roughly fall under two categories: stupid solutions and not-as-stupid solutions.

        Stupid Solution 1: Have the computer randomly generate high-entropy passwords. Nobody will be able to remember their password, and so they'll carry it around in their wallet or put it on a post-it note or whatever. This is a very popular solution, though, because it shifts blame from the IT / security people and onto the users.

        Stupid Solution 2: Lock out users if they enter the wrong password too many times. This prevents brute-force attacks, but makes it ridiculously easy for an adversary to launch a denial of service attack.

        Smarter Solution 1: Use pass phrases, not passwords. Pass phrases are much longer than our 10-character password, so even though per-bit entropy is lower we make up for it by having a ****ton of bits. The computer ought to have a few simple rules set up to increase entropy a bit (e.g. all pass phrases must have three special characters and a capital:lowercase ratio of at least 1:5), but this is still a vast improvement over purely random garbage passwords.

        Smarter Solution 2: Retry delays. In an online system the attacker needs to have the password checked by a remote computer (a web server, f'rinstance), so the computer doing the checking can institute a rule that e.g. the user can only make one login attempt every thirty seconds. (Depending on the system it may also be appropriate to ban a misbehaving IP address for an hour, as hopefully this scalpel approach will harm legitimate users much less often than the battleaxe full lockout approach that's much easier to implement but much easier to exploit.) Offline retry delays are trickier - this is where the attacker has a copy of the hashed password, so rather than sending the password to a remote system for authentication he can instead do the authentication on his own machine (hash a bunch of passwords, see if any of them match the real password hash, voila). The only delay is the speed of the attacker's machine(s), so that's where you hit them: rather than hashing the password once, you hash the password a thousand times. Passwords are rarely checked for legitimate users (they log in at the beginning of the day and maybe they have to unlock their computer after lunch), so if authentication takes thirty seconds instead of a fraction of a second then the user isn't adversely affected. However, the guy who's brute-forcing the password file is now looking at years of work instead of days or weeks of work.
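
        The "hash the password a thousand times" trick is standardized as PBKDF2, which ships in Python's standard library; the iteration count is the knob that turns the attacker's days into years (the values below are arbitrary):

        Code:
        import hashlib, os

        salt = os.urandom(16)
        slow = hashlib.pbkdf2_hmac("sha256", b"hunter2", salt, 1_000_000)
        print(slow.hex())   # takes ~a second on purpose; a brute-forcer pays it per guess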

        Preliminaries: arrays and pointers and WARGH

        All programs use RAM (unless they're terribly silly programs running on ASICs or FPGAs or whatever in which case you're probably not using them). Simple programs use a fixed chunk of RAM - if I've got a program that does nothing but multiply two 1024x1024 matrices and outputs the result, then I need a little over a million memory units per matrix (the unit size depends on what I'm multiplying, but I'm probably multiplying 8-byte values, so that's 8 megabytes per matrix) times three matrices (two for input, one for output) for a total of 24 megabytes plus maybe a few extra bytes for auxiliary crap (if I'm outputting this stuff to the screen and reading it in from a file then I need those routines loaded in memory). The only way things go wrong here is if a. the user tries to run the program when they don't have 24 megabytes of RAM free (this might just cause other programs to page out to clear room for this program, it might cause thrashing as programs keep paging in and out, it might cause the operating system to tell you that you don't have enough memory to run the program, or it might cause the operating system to vomit all over everything) or b. the user tries to access memory that doesn't actually exist (e.g. the 1025th row of a 1024x1024 matrix). Which leads us to our first problem: array bounds checking.

        Array bounds checking!!!!!

        You've probably heard of this problem before in the guise of "buffer overflows." A buffer overflow attack occurs when a programmer doesn't check to make sure that people don't e.g. write to the 1025th element of a 1024-element array. When this happens you can get all sorts of errors (spectacular or otherwise), but the ultima thule is a smashed stack. A program consists of a bunch of subroutine calls (unless it's a silly program without subroutines, but you'll usually only find these on embedded systems) - when function A calls function B then the computer saves A's data and pushes a fresh activation record for B into memory (a fresh "copy," because B might be active multiple times at once if there's recursion going on). All of this occurs on the "stack" (again there are exceptions but whatevah), so-called because it's a first-in-last-out data structure (you "push" B onto the stack and when it concludes you "pop" it off the stack). The point of all of this is that you've got both data (the variables associated with A and B that are scoped to the functions, i.e., cease to exist when their respective functions are popped from the stack) and control data (return addresses, saved program counters, etc.) living side by side, and this is a bad mix.

        Assume that A has some of the 1024-element arrays mentioned before, which it places on the stack (it can place the arrays on the "heap" instead of the "stack," more on this later, but buffer overflow attacks almost always deal with arrays on the stack so I'll limit myself to the stack here), and it calls B and gives B the address (the "pointer") of the 1024-element array (hereafter called the "**** you array"). At this point the stack looks like

        B
        A's saved data (saved when B is pushed)
        **** you array
        other crap for A
        start of A
        whatever called A

        "A's saved data" is such information as the return address for when B finishes whatever it's doing - B pops, the runtime system checks A's saved data to find the return address ("start of A" on the stack diagram), pops/saves A's saved data, and sets the program counter to "start of A."

        Except B has the address of the **** you array, and is about to smash the stack. B writes some data to the 1025th-1040th elements of the array, which amounts to writing data to "A's saved data," which contains A's return address. Oh ****, something has just gone horribly wrong. Here's what the top of the stack looks like now:

        B
        Address of C
        **** you array

        Now when B pops, the program returns to the start of function C instead of the start of function A. Function C emails porn to your boss using your work email account. Blargh.
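
        Here's the whole failure simulated in a few lines of Python - a list stands in for the stack, the saved return address sits just past a 4-element buffer, and B writes one element too many (a toy model of the mechanics, not an actual exploit):

        Code:
        def function_c():
            print("emailing porn to your boss...")

        def function_a_return():
            print("returning to A, all is well")

        stack = [0, 0, 0, 0, function_a_return]   # buffer[0..3], then A's return address

        def b_fills_buffer(data):                  # B copies input with no bounds check
            for i, v in enumerate(data):
                stack[i] = v

        b_fills_buffer(["x", "x", "x", "x", function_c])   # one element too many
        stack[4]()                                 # the "return": lands in C, not A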

        What a terribly silly mistake, why not just prevent B from doing this **** in the first place?

        It's easy enough to prevent this - rather than giving B the address for the **** you array, function A instead gives B the address for the function "safe_array_access," which checks to make sure that the array element B wants to access is legal (i.e. "if index > 1024 then shoot B"). Now B can't smash the stack (assuming that safe_array_access isn't broken), but a simple array access now requires an entire function call, i.e., we've just slowed down our program by about one to three orders of magnitude depending on how much the cache hates us now.

        Well that sucks, but safety first, eh what old chap?

        No. Safety/security is important, but it's not the be-all end-all of design considerations. Bertrand Meyer gives an excellent example when explaining why he designed Eiffel so that the programmer could turn off automatic array-bounds checking if that's what floated his boat: Let's say we've got some weather forecasting software that can (relatively) accurately predict a storm 24 hours in advance. (Weather forecasting software typically makes heavy use of matrix arithmetic.) He was generous with his example and assumed that array-bounds checking "only" doubled the program run time. If it takes three hours to compute the forecast, then that means that e.g. ships have a 21-hour lead time. Is it worth it to add array-bounds checking and change that to an 18-hour lead time? What if it takes six hours without bounds checking and twelve hours with checking - is a 12-hour lead time sufficient? He specifically chose weather forecasting as an example to waylay the usual "Moore's Law will fix the problem!" argument - Moore's Law in the context of weather forecasting does not mean "we'll halve the time it takes to get a 24-hour weather forecast," it means "we'll be able to get a 48-hour weather forecast in the time it used to take to get a 24-hour weather forecast."

        In other words, sometimes you should use array-bounds checking, and sometimes it's inappropriate, ergo there is no magic bullet to prevent stack-smashing. And with that I'm going to go back to sleep.

        Stacks and Heaps

        These two structures are aptly named - the stack is a well-ordered first-in-last-out structure where old data are never deallocated before new data, whereas the heap is a huge cluster**** of data with no rhyme nor reason as to when they're allocated or deallocated. I'm exaggerating a bit, but to be honest the only thing that can be said for certain about the heap is that it's not the stack. For this reason I'll illustrate two example heaps that are used in real programs (I'll stick to manually managed heaps for now, because (semi-)automatically managed heaps are a different sort of beast).

        Bump-Pointer With Hideous Free-List Heap

        The bump-pointer heap starts out by allocating data just like a stack - new data are pushed on top of the heap (note that "top" and "bottom" are relative, as the heap may grow from low-to-high addresses or vice-versa), which is fast and doesn't waste space... until we start deallocating data. The stack allocates from bottom to top and deallocates from top to bottom; the heap deallocates where and when it pleases. That's where the free-list comes in.

        When a chunk of data is freed, it's added to a list of other chunks of data that have been freed. This usually doesn't waste any memory, because the free chunks of data can be used to store the list of free chunks of data. The problem arises when the bump-pointer reaches the top of the heap and we still need to allocate data; now we've got to search the free-list for a free chunk of data that's (about) the right size for our needs, and this can be expensive (bump-pointer allocation is constant-time, free-list allocation is log-time if we're being fancy or linear-time if we're being lazy) and can also waste space (what if you can't find a free chunk of memory that precisely fits your present need? Then you have to use a chunk that's too large, which either creates fragmentation or else creates little chunks of free space that are too small to use).

        In the worst case, we find ourselves with a heap that e.g. contains a bajillion non-adjacent 7-byte chunks of free space, and here we are wanting 8 bytes of data. In this situation we can have many megabytes of free space available but we're still ****ed.

        Segregated Free-Lists

        To help avoid the situation where none of our free chunks of memory can satisfy an allocation request, we can do away with the bump-pointer entirely and start out by splitting our heap into a bunch of differently-sized free-lists. Our choice in sizing determines how much space we might waste as well as how difficult it is to merge lists, so to keep things simple I'll use the simplest segregated free-list, the power-of-two segregated free-list.

        I lied about doing away with the bump-pointer, but we're going to be "smart" about our bump-pointer use this time: every memory allocation must be a power-of-two size. F'rinstance, a 6-byte allocation is allotted 8 bytes, an 11-byte allocation is allotted 16 bytes, etc. The downside is that if every allocation is a power-of-two-plus-one then we're wasting about 50% of our heap. The upside is that allocation starts cheap and stays cheap (we don't need to search a cluster****ed free-list for free space, we just need to search for an appropriately-sized list). In addition, we can maintain some semblance of organization - whenever we have two adjacent chunks of free space that are the same size, then we can merge them into one chunk of free space and simply move it to a different free-list. Likewise, we can split free space in half (recursively) to satisfy smaller memory requests. This has the added advantage of improving locality when we use a large chunk of free space for a lot of smaller allocations.
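
        A sketch of the size-class bookkeeping, minus the splitting and merging (the names and the 8-byte minimum are my own arbitrary choices):

        Code:
        def size_class(n):
            # round up to the next power of two, with an 8-byte minimum
            return 1 << max(3, (n - 1).bit_length())

        free_lists = {}                      # chunk size -> free "addresses"
        bump = 0

        def allocate(n):
            global bump
            size = size_class(n)
            if free_lists.get(size):
                return free_lists[size].pop()   # cheap reuse of a freed chunk
            addr = bump                         # else bump-allocate fresh space
            bump += size
            return addr

        def free(addr, n):
            free_lists.setdefault(size_class(n), []).append(addr)

        a = allocate(6)             # a 6-byte request gets an 8-byte chunk
        free(a, 6)
        print(allocate(7) == a)     # True: the freed 8-byte chunk is reused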

        Why god why???

        The heap is more expensive to maintain than the stack, but it also gives the programmer more flexibility - but you probably don't care about that. What you care about is that buffer overrun in my previous post. The buffer overrun occurred because non-executable data (the array) and executable data (the program counter) were both allowed to reside on the stack. If that array had been on the heap instead, then a hacker could cause some damage with a buffer overrun, but the risks would be far less severe. We (meaning the AFRL Squad(tm) (but mostly Warto and MilHip)) found and implemented a "heap smash" buffer overrun, and it required that the attacker write several gigabytes of nops to memory. Most people would have turned off their computers after the first few megabytes of nops ("why the hell is my computer hanging? Oh well, ctrl-alt-del").

        Timing Attacks pt. 2

        Earlier I'd given some vague hand-waving example about how you can use timing attacks to break public-key crypto. Here's a more concrete and more insidious example using SSH tunneling.

        SSH (Secure Shell) sets up an encrypted channel between you and your remote desktop/server/whatever. I'll ignore the problem of setting up the initial connection - we'll start from the point that the SSH connection is in place, at which point every character you type on your computer is encrypted with a secret key and sent to the remote computer, which decrypts the character and uses it as input (returning (encrypted) data if requested). Anybody familiar with Telnet is familiar with SSH, as they're basically the same thing except that SSH is encrypted while Telnet is not.

        The flaw here is in that part where I said that each character is encrypted and sent across the channel. Let's say that you're SSHing from a (relatively) dumb terminal into your home computer to do some internet banking. You type in your username and password, which are encrypted and sent across the channel one character at a time. Your remote computer decrypts the username and password, logs you into your bank account, and sends you encrypted screenshots (or whatever) of your account summary.

        So what's the problem? I'll show you - pull out a stopwatch, and time how long it takes you to type Jabberwocky, and now time how long it takes you to type 1xP_tT@9f;f - when you're finished, report back on which password took you longer to type (hint: it was the second one). So the attacker can figure out right off the bat whether your password contains a lot of entropy (in which case you need to make sure that he doesn't bribe the cleaning staff into faxing over the post-it note containing your password) or relatively little entropy (in which case you're susceptible to a dictionary attack). But it gets worse, because SSH doesn't send a string at a time, it sends a character at a time. Special characters generally take longer to type, adjacent characters take less time, mixed letters/numbers (1s2d) take more time than runs of letters/numbers (12sd), capitalization adds a small delay, etc. The attacker can also attempt to model your handedness, whether you're using an ergonomic keyboard or a piddly little laptop keyboard, etc. The point is that you're leaking tons of timing data, and at that point it just becomes a statistical game of how much data you leak before the attacker can crack your bank password. (The attacker can monitor your remote computer's packets to determine when you're logging into your bank account, so he won't become overwhelmed by data.)

        How do you avoid this?

        Solution 1: Type left-handed. Only applies to right-handed people; if you're left-handed then you should only type right-handed. By this I mean that you don't use the other hand at all when entering usernames and passwords. This'll **** up the timing data to no end.

        Solution 2: Use a fancier SSH client. Put the client in "password mode" or whatever, instructing it to send no more than 1 character every 10 milliseconds, and to randomly insert additional 10 millisecond delays. This ought to sufficiently muck up the timing data.
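
        A sketch of that idea - pace every keystroke on a fixed-plus-random clock so the inter-key gaps stop correlating with what was typed (send_char is a hypothetical stand-in for the client's transmit routine):

        Code:
        import time, random

        def send_padded(chars, send_char=lambda c: None):
            # send_char is a hypothetical stand-in for the SSH client's transmit call
            for c in chars:
                time.sleep(0.010 * (1 + random.randint(0, 3)))  # 10ms slots + jitter
                send_char(c)

        send_padded("1xP_tT@9f;f")   # the gaps no longer depend on the keys pressed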

        Solution 3: Keep changing your passwords. The attacker probably won't be able to get your password on the first (or even tenth) time that you log into your bank, but it's only a matter of time. However, if you change your password every e.g. five times you log in, then the attacker is ****ed (assuming your new and old passwords are sufficiently different - "jabberwocky2" is not a good password replacement for "jabberwocky1"). Have a password generation program randomly assign you a password, write it on a post-it, and lock the post-it in a safe.



        • #5
          Best thread ever!
          The enemy cannot push a button if you disable his hand.



          • #6
            We just plug in a KIV-7M at both ends of the circuit with the same Key and we're good to go at work.
            Today, you are the waves of the Pacific, pushing ever eastward. You are the sequoias rising from the Sierra Nevada, defiant and enduring.



            • #7
              I actually read a bunch of that. It's kind of hard to respond to though.
              John Brown did nothing wrong.



              • #8
                It's too much content to read while I'm not at work. Tomorrow.
                "The issue is there are still many people out there that use religion as a crutch for bigotry and hate. Like Ben."
                Ben Kenobi: "That means I'm doing something right. "



                • #9
                  I don't think you need to read any of that Asher, from what I skimmed it's not anything you wouldn't already know.



                  • #10
                    Kuci, you get worse every day. You're actually screening for what is worth Asher's time now? I used to have a little respect for you, but that's out the window now.
                    Life is not measured by the number of breaths you take, but by the moments that take your breath away.
                    "Hating America is something best left to Mobius. He is an expert Yank hater.
                    He also hates Texans and Australians, he does diversify." ~ Braindead





                      • #12
                        Originally posted by Kuciwalker:
                        I don't think you need to read any of that Asher, from what I skimmed it's not anything you wouldn't already know.
                        Loinburger has a terrific sense of humour and it's always engaging to read his take on things.
                        "The issue is there are still many people out there that use religion as a crutch for bigotry and hate. Like Ben."
                        Ben Kenobi: "That means I'm doing something right. "



                        • #13
                          Thanks for the thread loinburger.

                          I'm disappointed that there was not one use of hereuntoforthwith.
                          Scouse Git (2) La Fayette Adam Smith Solomwi and Loinburger will not be forgotten.
                          "Remember the night we broke the windows in this old house? This is what I wished for..."
                          2015 APOLYTON FANTASY FOOTBALL CHAMPION!



                          • #14
                            they didn't trust their browser enough to send a company their credit card information over an encrypted connection and so they were going to just email their credit card information to the company instead.
                            WTF?!?



                            • #15
                              I was trying to send in a credit card reservation to a hotel in Belgium and when I hit return it said that although the form was encrypted it was being sent over an unsecure line and could be easily read by someone else. So I backed down. I don't need my credit card number floating around, right?

                              I am hoping the hotel gets my reservation sans credit card and emails me and I can then send them that info on an email. The fact that Windows told me "it could easily be viewed" was enough to scare me off.

