
Monday, April 23, 2007

Installing and using Truecrypt on Ubuntu


Update: while Truecrypt still doesn't offer native packages (i.e. .deb / .rpm) for Linux distributions, their shell-script installer works just fine. So the simplified version of the installation procedure is:

  1. Download the correct package from Truecrypt (either 32 or 64 bit - you can find out which you need by typing uname -a - if it says i686 you need the 32 bit version, if it says x86_64 you need the 64 bit version)
  2. In the directory where you downloaded: tar xvf truecrypt-7.0a-linux-x86.tar.gz
  3. sudo ./truecrypt-7.0a-setup-x86
  4. Click "Install Truecrypt"
  5. Launch it from Applications -> Accessories or by typing truecrypt
  6. If you later want to uninstall truecrypt: sudo /usr/bin/>

While I was upgrading my storage subsystem (I bought two new hard-drives :)) I thought that this might be a good time to go fully encrypted for privacy reasons. The solution I selected was Truecrypt, since it seemed to be the only one to offer cross platform support. However the Linux part of it is not complete and you may have to employ a few tricks, which I describe below:

Truecrypt does not have packages (yet) for Ubuntu 7.04 (Feisty Fawn), so you have to go with the source distribution. My installation experience was pretty flawless, but others had problems with it, so you might need to google around a bit. What you need:

  • The build-essentials package (sudo apt-get install build-essential)
  • The source files which correspond to your kernel version. You can find out which kernel version you have by typing uname -r at the console. For example I have 2.6.20-15-generic, and the corresponding source package for it is linux-source-2.6.20 (observe that the patch version is not important)
  • The latest Linux kernels are compiled with gcc 4; however, if you have an older kernel, you should check which gcc version it was compiled with, since you need to use the same version when compiling Truecrypt. You can do this by typing cat /proc/version at the console. For example the output on my system was Linux version 2.6.20-15-generic (root@palmer) (gcc version 4.1.2 (Ubuntu 4.1.2-0ubuntu4)) #2 SMP Sun Apr 15 07:36:31 UTC 2007. The important part of this is the gcc version ... part. If it says something like 3.4 there, you should install the corresponding version of gcc (sudo apt-get install gcc-3.4 - the subversion is not important) and make sure that the build process uses it by typing the following at the console from which you will launch the build: export CC=gcc-3.4

Now for the building process (taken from howtogeek and the ubuntu forums):

  1. Download the source code (by going to the download page and selecting Other (source code))
  2. Extract the archive using either the GUI (with Archive Manager) or by typing at the command line tar xvfz truecrypt-4.3-source-code.tar.gz (if you downloaded a different version of truecrypt, you should replace the archive name with the name of the archive you downloaded)
  3. Do the following in the terminal (the same terminal where you did the export... step, if it was needed - otherwise it doesn't matter):
    cd /usr/src/
    sudo tar xvfj linux-source-2.6.20.tar.bz2 
    sudo make -d -C linux-source-2.6.20 modules_prepare
    Warning! The last step can take a considerable amount of time (up to an hour), so be prepared with some fun games
  4. Now you are ready to install truecrypt:
    cd truecrypt-4.3-source-code/Linux/
    sudo ./
    sudo ./

After installing, you can create and mount Truecrypt volumes (including ones created under Windows). Here are some gotchas to watch out for:

When creating a Truecrypt volume (under Linux), you have to specify FAT for the filesystem. This is needed because Truecrypt does not have an option (as far as I know) to mount the volume as a block device and refuses to mount it if it can't recognize the file system. If you wish to use a more sane file-system (like ext3, reiserfs or ntfs even), do the following:

  1. Create the volume with a FAT filesystem
  2. Mount the volume
  3. Now unmount the filesystem part using umount (not truecrypt -d). For example on my system I would do sudo umount /media/large. To find out the exact parameter you need to pass to umount, do a sudo mount and look for a line which begins with /dev/mapper/truecrypt, and use the part after on (for example on my system it says: /dev/mapper/truecrypt0 on /media/large type fuseblk (rw,nosuid,nodev,noatime,allow_other,default_permissions,blksize=4096) and thus I need to use /media/large). If you have multiple such lines, do a truecrypt -l to find out which one you need to use.
  4. Use mkfs to create the filesystem you wish. For example to create an NTFS filesystem, I would do sudo mkfs -V -t ntfs /dev/mapper/truecrypt0
  5. Now re-mount it.

If you wish to mount an NTFS formatted volume in read/write mode, you need to have the ntfs-3g driver installed, and when mounting, specify it with --filesystem ntfs-3g, because the autodetect mode will result in the usage of the read-only ntfs driver. Also, the user mount option doesn't seem to work for me, so instead you can use the --mount-options gid=100,uid=1000,umask=000 parameter to make the volume accessible to all users. You can find out the number you need to type for gid (GroupID) and uid (UserID) by doing a cat /etc/group|grep user and cat /etc/passwd|grep [your user name] respectively.

Finally, be aware that truecrypt gives you the option to specify sensitive data (keyfiles, passwords) at the command line. While this is convenient, doing so will give huge clues to any decent attacker, because the command line is stored in the ~/.bash_history file, effectively giving away your passwords. Now, you can clear your history file by doing a history -c, however the strings are still on your hard-drive in the slack space. The best thing is to never specify these things at the command line and to let truecrypt prompt you for them.

Update: if you don't want to move around your mouse when creating a new volume (to generate random numbers), just put --random-source /dev/urandom on the command line. While this reduces the theoretical strength of your encryption, in practical terms it doesn't affect you.

Update: as a reader pointed out in the comments, there is a simpler way to use a file system different from FAT: after creating the volume, the first time you mount it, don't specify the directory where it should be mounted. This will mount it as a block-device, but will not attempt to use any file-system on it. Then issue the truecrypt -l command to see where it got mounted and use the mkfs family of commands to create a filesystem of your desire.

Sunday, April 22, 2007

Mixed links


A new Ethical Hacker Challenge is on.

X for Windows without Cygwin!

GreatFireWallOfChina - test any site and see if it's blocked in mainland China - via OffTheHook

Via the All About Linux blog: a very fun (and very addictive!) flash game: Desktop Tower Defense.

Cleaning it all up - temporary files in Perl


One of the most frustrating things in programming is doing all of the extra plumbing. You can't just say (if you are trying to create a stable product): open file A, read a line, transform it and dump it to file B. You have to think about all the error conditions which may appear: what if file A can't be opened? What if file B can't be opened? What if the read line is invalid? What is the rollback strategy for every possible situation? This is why things like Ruby on Rails or garbage collected languages are so popular: because they take care of the details in the background, without the programmer having to explicitly think about them.

One detail which comes up frequently in scripts is the creation of temporary files which should be removed once the script terminates. One easy (and cross platform) way to do this is to use the File::Temp module, or you could write an END block. However both of these methods fail to delete the file if, at the moment when the cleanup is being performed, the file is being read by another process. The Windows API offers a very nice way of dealing with such a problem: the FILE_FLAG_DELETE_ON_CLOSE flag for the CreateFile API (which is the function used to create / open files - think of it as a lower level version of open). What this flag does is tell Windows that the file should be deleted when the last handle to it gets closed. This means that we can open it from multiple processes, and when the last open handle is closed, the operating system automagically deletes the file. Below you can find two code samples, one for creating the file and one for reading it from another process. The drawback of this method is that it's not portable and that you have to use the Windows APIs to access the files, because open can't produce the required flags.

# Writer: create the temporary file with FILE_FLAG_DELETE_ON_CLOSE so
# that Windows deletes it once the last open handle is closed.
use strict;
use warnings;
# the :Misc tag supplies GENERIC_READ / OPEN_EXISTING for the reader sample
use Win32API::File qw/:Func :FILE_FLAG_ :FILE_SHARE_ :Misc/;

my $contents = "some temporary data";
my $fh = createFile("file.tmp", "wc", "",
    { Attributes => "t",                       # temporary file
      Flags      => FILE_FLAG_DELETE_ON_CLOSE,
      Share      => FILE_SHARE_READ });
WriteFile($fh, $contents, length($contents), [], []);

# Reader (running in another process): it must ask for read, write and
# delete sharing (4 == FILE_SHARE_DELETE) to be allowed to open a file
# that was created with FILE_FLAG_DELETE_ON_CLOSE.
my $rh = CreateFile("file.tmp", GENERIC_READ,
    FILE_SHARE_READ | FILE_SHARE_WRITE | 4, [], OPEN_EXISTING, 0, 0);
my $buff;
ReadFile($rh, $buff, 8192, [], []);

Two more interesting things: the OS takes care of closing the handles when the process exits, so there is no need for doing it in the write example. And whenever the API documentation says that NULL can be passed in for a particular parameter, the corresponding Perl construct is a reference to an empty array ([]). This was probably chosen because undef's are stripped from lists (meaning that (1, undef, undef, undef, 2) is (1, 2), so we couldn't tell at which positions the undef's were located).

And also a useful link: Precompiled regular expressions from the Windows Perl Blog.

Saturday, April 21, 2007

Hack the Gibson #88


Read the reason for these posts. Read Steve Gibson's response.

A question which popped up twice in this episode was the problem with broadband users, and the answer provided was very good: even if 50% of the people who have broadband were to turn off their connection when they are not using it, the other half would still pose a very big (and ever growing, as we get more and more bandwidth worldwide) risk. There are some things ISPs can do, but ultimately it comes down to user education. I'm in favor of a required computer driving license (because computers are clearly much more complex than cars, and you can inflict big - although mostly material, not physical - damage if you don't know what you're doing - and you can imagine even extreme cases where your computer becomes part of a botnet attacking a hospital which loses patients because its network becomes unusable) or of managed security, but these are not things I would consider becoming real very soon, given how the marketing departments of IT companies want to convince us that computers are simple and everybody can use them.

To the listener who discovered XSS flaws in his bank's website: I would recommend first of all to report his findings anonymously, and second of all, to watch two great videos from Shmoocon 2007 (better yet, watch all of them since they are all great, but these two are pertinent to the matter at hand):

  • Assess the security of your online bank without going to jail - Chuck Willis
  • Vulnerability Disclosure Panel Palaver (or 0-day OK, No Way, or For Pay) - Katie Moussouris

Regarding the Vista / UAC problem: I don't use Vista, so maybe I misunderstand the problem, but there are ways of elevating a program explicitly (most probably they are still present in Vista) without logging out and logging back in as a privileged user. On earlier versions of Windows, running unprivileged and privileged programs on the same desktop was not 100% safe because of the possibility of the privileged program being vulnerable to the shatter attack; however, that is more of a theoretical vulnerability which wasn't used at all in real malware as far as I know, and the message system in Vista was redesigned to remove the possibility of this happening.

To the fellow wondering whether malware can re-enable features he disabled: it can, and many times it does. This is why running as a limited user is such an important thing. Programs running under limited accounts can not do these things. However programs running with high privileges can do anything you can (including stopping your AV product or firewall). This is a point that companies and security professionals tend to forget to mention (the reasons of the companies are clear, however I don't know why there is so little discussion in the security professional circles about this topic). The claim that most programs won't run in low rights mode is entirely false: I've been running with low rights on Windows XP and 2003 for the past year or so and had very few problems. And the products which are rumored to have the most problems (like developer tools - MS Visual Studio, including 6.0) worked flawlessly.

And here is a little plug (because Steve also plugged his password generator): when you use mine, you don't have to trust anyone. In Steve's case you have to trust him that he's not keeping a log with every IP and the passwords which got generated for it. Now I'm not implying that he is, I'm just saying that he could be. If you choose my solution, you can inspect the source code to make sure that nothing funky is going on.

About malware detection and cleaning: Windows offers a very limited set of tools to properly diagnose the health of the system. The best thing you can do is to watch this presentation by Mark Russinovich (co-author of Windows Internals!) about the topic.

Regarding question nine: while it is technically correct that you can't prevent tunneling through SSH while allowing SSH at the same time, there is something you can do: enable outgoing connections only to a limited set of addresses (which are probably work related), and make sure that those SSH servers are configured not to do forwarding (and also make sure that the user accounts given to your employees don't have enough rights to change the configuration).

To the admin who got caught by the Webmin vulnerability: configure your firewall to be as restrictive as possible with both inbound and outgoing connections. Limit the access to important files through other means too (like .htaccess files in Apache).

Regarding the question of the student with IP/ShieldsUp: Leo kept mixing up public and static IPs (btw, the student seemed confused about it too). The two have no relation to one another (in fact you can have public/private and static/dynamic IP addresses in any combination). Also, buying a router doesn't help much if it doesn't get properly configured (because otherwise you have a public facing administration interface with default passwords, on many routers). And if you are knowledgeable enough to configure the router, you are probably knowledgeable enough to secure your computer without it.

Thursday, April 19, 2007

Hack the Gibson #81 to #87


Read the reason for these posts. Read Steve Gibson's response.

Here is again a long overdue post about the recent Security Now episodes. I have to say that the quality of the information provided in the recent episodes deteriorated (or maybe it is that they started talking about more concrete things where the errors are also more concrete).

First of all a word about Steve Gibson's magic tool Spinrite: in all the episodes I've heard, Steve and Leo never once mentioned the fact that failing hard drives should be replaced after a successful recovery. To be fair, I've heard from other sources that there were occasions where they made such recommendations, however I never personally heard such a thing. In my opinion it is irresponsible to let people keep using failing hardware which puts their data in continuous jeopardy without giving them the proper advice: back up, get a new hard drive and throw away the old one (after securely wiping it with something like dban). Then again, I don't know what some people are doing with their hardware, since in the 15 years I've been using computers (knock on wood) not once has a HD died on me or on any of the people I know personally. Now back to our regular schedule :).

Episode #81 was pretty uneventful if you don't consider the few times here and there where Spinrite was pushed, of which I disapprove not because I think that it is a bad product (given that I can't test it since there is no demo / trial version available) or because I think that podcasts should be commercial free, but because in my humble opinion they fail to mention the disclaimer regularly enough: if you have a failing drive and by some wonder you manage to get it working make a backup and get a new hard drive! If you want to hear an informative podcast about data recovery, I would recommend this episode of Cyberspeak instead.

Episode #84 also has some Spinrite praising at the beginning - this time a corporate server was allegedly resurrected, which is cool, however there is again no mention of replacing the failing hard drive. A company which wouldn't cough up the money for a new hard drive for the server when the old one is dying... I really have no words for it. Also, immediately following it, there are Leo's comments on how he is recording to a drive fixed by Spinrite.

The first omission concerns the guy asking about moving his SSH server to a non-standard port as a security measure. While moving ports is a very good addition to security (and I recently recommended it), there is also another very good security measure: firewalls. Does everybody in the world really need access to your SSH server? Restrict its availability as much as possible. Ideally you would restrict it to a couple of IP addresses, but even if you have larger needs, at least restrict it to some (even if large) subsets of IPs (you can do this with the built-in firewall starting from Windows XP). There is really no need for the Chinese government to be accessing your servers.

Getting to the Javascript question: the original name of Javascript was Livescript, and it got the name Javascript as a marketing gimmick. If you want to learn about Javascript from the masters, visit the YUI (Yahoo! User Interface) theater and watch the videos of Douglas Crockford. They are gold. Getting back to the supposed problems of client side scripting: many problems (both security and usability) are not created by Javascript, and because of this they don't go away magically when you disable it. It is possible to build sites which are usable without scripting enabled by subscribing to the design principles of progressive enhancement and Hijax (in fact it is very advisable to do so, to support alternative platforms like screen readers or mobile browsers). Also, it is possible to perform many attacks (like reprogramming your router) without Javascript: you simply have to include a hidden image / iframe tag in the page, which results in the corresponding request being sent to the router. Turning off Javascript is security much in the same sense as using Linux is - because it represents a very small percentage of the market, there are few attacks directed at it. While security by obscurity is not necessarily a bad thing, one should consider its deficiencies and not rely on it exclusively. Also, the web was originally meant to be a read/write medium (contrary to Steve's comments) and only became read-only at the beginning because of technological restrictions. Also, Javascript is as sandboxed (if not more so) as Java, and Google Mail runs both Java and Javascript - Java as a back-end solution with their GWT and Javascript as a front-end solution. Again, many of the problems are not caused by Javascript and would still be exploitable even if it didn't exist.

Episode #85

The first error is near the beginning :), where Steve says:

That is, you know, any time you’ve got current running through a wire, you generate a magnetic field around the wire. And when the frequencies are high enough, that ends up generating radio frequency emissions, which of course travel much greater distances than magnetic emissions.

The correct statement is: current running through a wire generates an electro-magnetic field which, when it has a high enough frequency (like in a radio-frequency band), can travel pretty far (both the electric and the magnetic components - which are actually perpendicular to one another). High-school physics in Romania.

Also the podcast doesn't do a very good job of explaining what SQL is and what the relation is between SQL (the language) and RDBMS products like MySQL or MS SQL. In short: SQL is a language (in fact an ANSI standard), and products like MySQL, Postgres, MS SQL, Oracle and others are database servers which use the relational model to store data and data about data (meta-data) - hence their collective name, RDBMS or Relational Data Base Management Systems - with SQL as the language for manipulating it.

From here on the show goes downhill, with statements like Jikto requires a Nikto server to run (it does not). You are better off watching the entire video for yourself. It is very informative and somewhat scary, but the press (and this podcast) certainly managed to overhype it. They also manage to mis-identify the problem. The problem is not that there is no separation between displayed textual content and executed scripting content. The problem is that the security model of the browsers (and you can't blame Microsoft or Mozilla here, because the problem is in the original standard) has some loopholes. The good news is that now that they have come to light, they will be fixed (although slowly).

Episode #86

Steve can't figure out what Blink is: It sounds like it’s an antivirus. It sounds like it’s antispyware. It also sounds like it does some Internet security. It sounds like it would be a very good choice for a lot of people. This is surprising from someone who explained ASLR (Address Space Layout Randomization). Blink basically brings ASLR and some similar technologies (which, btw, have long been available on other operating systems) to Windows to prevent exploits from running, and it does this very well, both because there are some very smart people working on it and because it has a small market share not specifically targeted by exploit authors. This is a very important pillar of malware protection: if you can make your environment sufficiently different from the average user's environment (for example by installing Windows in a different directory, running as a limited user, etc.), you are protected from 95% of the malware out there, which targets high numbers and doesn't make any effort to dynamically adjust to different or more restrictive environments. This could (and probably will) change as such methods become more widely used; the idea is to stay ahead of the curve (remembering that security is not a point in time, it is a process).

Also the statement IE7 with its enhanced protection turned on was not vulnerable to this is incorrect. IE7 was vulnerable to it, the only thing the exploit couldn't do was to write files (to install it or some of its components permanently), however it could read files and communicate their contents back to the server.

While there are some good remarks in the podcast, like As I’ve said before, as a developer myself, the psychology of the developer is, oh, can I just get this thing working? I just want to be done with this. I want to get this working because my boss is breathing down my neck, and I’m already three weeks late. And so there isn’t that awareness., it is confusing, lacking in details and provides no real solution. If you want a more accurate description of the problem without all the FUD (Fear, Uncertainty and Doubt), go listen to this podcast.

Episode #87

First an update about the protected processes: a clever guy (originally from Romania :)) already figured out how to remove the protection.

The rest of the podcast tries to discuss SQL injections, but falls in the same (hopefully unintentional) trap as the previous one: makes some mistakes which could confuse beginner listeners, overstates the possibilities and gives no defensive solutions. I would recommend reading my previous post, not because I'm so much smarter, but because I tried to focus on these issues and at the end of it there are links to people more knowledgeable in these matters.

The end (momentarily).

SQL injections - what they are and how to avoid them


SQL injections are a subtype of the larger category of command reparse vulnerabilities. These attacks work because there is an intermediate language between different components of the system, more specifically between the frontend (which is typically a webserver giving access to the whole world) and the backend (which is hidden behind a firewall / NAT and supposed to be protected by the access checks in the frontend).

To better understand the matter, take the following scenario: you have a website which allows listing of certain directories on the computer. Lets suppose that it works the following way:

  1. User goes to a webpage
  2. User enters in a form the directory s/he would like to view the contents of
  3. The server side script takes the directory, executes the ls (or dir, if it's a Windows system) command, takes the output and sends it back to the user

This is usually how we think about the execution flow, however the devil is in the details as they say: the last step actually consists of several substeps:

  1. The server side script takes the directory, constructs a command line to be executed and passes it to the shell (for example bash or cmd.exe)
  2. The command interpreter parses the command and executes it
  3. The output is sent back to the user

These types of vulnerabilities are created because of a mismatch between how the commands / queries are constructed and how they are parsed. For example let's say that the command is created by appending the directory to the string ls (so if the directory is ~/, the resulting command would be ls ~/). Now if somebody were to type in ~/ && cat /etc/passwd, the resulting command would be ls ~/ && cat /etc/passwd, which would print out not only the directory listing but also password data about the users. The fundamental problem here is that at the point of construction we are not aware that some characters / character combinations (like the && sign) have special meaning for the command interpreter (in this case: execute the first command, and then execute the second command).
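The ls example above can be sketched in Python (the function names and the use of Python's subprocess module are mine, for illustration): the unsafe version hands one concatenated string to the shell, while the safer version passes an argument list, so the shell never gets a chance to reparse the input.

```python
import subprocess

def list_dir_unsafe(directory):
    # The command is built by string concatenation and parsed by the
    # shell, so "&&" in the input is interpreted as a command separator.
    result = subprocess.run("ls " + directory, shell=True,
                            capture_output=True, text=True)
    return result.stdout

def list_dir_safer(directory):
    # An argument list bypasses the shell entirely: the input reaches
    # ls as a single literal argument ("--" also stops option parsing).
    result = subprocess.run(["ls", "--", directory],
                            capture_output=True, text=True)
    return result.stdout
```

With the input ~/ && cat /etc/passwd, the first function executes two commands; the second merely complains that no file with that strange name exists.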

How can these types of attacks be mitigated generally (later on we will cover the special case of SQL injections)?

  • First of all, try to avoid going through the additional encoding / parsing step if possible. This means that before calling out to another system, check whether you can't do the action with built-in functions (all server-side scripting languages have, for example, functions to obtain directory listings). Using built-in functions you get both a performance gain (because creating a new process takes time, as does constructing and parsing the command line / query string) and a security gain (because you can use a much clearer - from the computer's point of view - language when communicating with built-in functions: parameter lists, where it is known in advance which parameter represents what and there is no possibility of confusing parameters with commands. Additionally there is very little chance of calling one method and ending up performing a different action - like in the example where we wanted to get a directory listing and ended up printing out a file - unless there is a serious vulnerability in the scripting language, and there aren't many of those)
  • Second of all: if you have to call out to other systems, make sure that you escape the special characters in your query. Escaping means marking them for the parser at the other end, so that the parser knows that it's meant to interpret the given characters literally rather than with their special meaning (for example, in the previous case, if we had issued the command as ls ~/\ \&\&\ cat\ \/etc\/passwd - where we marked with a backslash that the space and double ampersand characters are meant to be understood as literal characters rather than argument separators or command concatenators). Do not use homebrew methods for doing the escaping, because the parsing rules are complicated and chances are that you will miss some exceptions (and you will waste a lot of time writing the code). Use the built-in functions of your language, and use the specific functions rather than the general ones - the ones provided by the libraries created to access the specific resource - because chances are that they know the syntax of the target language best (for example in PHP use escapeshellarg for command line parameters and mysql_escape_string - or better, mysql_real_escape_string, which takes the connection's character set into account - when accessing a MySQL database, rather than the generic addslashes. Chances are that there are additional syntax rules for the given type of strings, and while they might be similar to some generic group - like C style strings - they are not identical to it)
  • Restrict the privileges of the called command by external means (like running it under a restricted user account). This way even if some things evade your other security measures, they won't be able to do much damage.
  • Filter your input and use whitelisting (specifically look for and only allow known good elements rather than trying to eliminate all the bad things - because you have a greater chance of missing something when trying to enumerate all the bad things, given how they are many of them, than when trying to enumerate the good things). If you are using regular expressions, look out for this gotcha.
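The last two points can be sketched in Python (the character whitelist is only an example; re.fullmatch is used precisely to avoid the partial-match regex gotcha mentioned above):

```python
import re
import shlex

def safe_dir_arg(directory):
    # Whitelisting: accept only known-good characters, and match the
    # entire string - a partial match (re.match / re.search) would
    # accept any input that merely contains a valid-looking piece.
    if not re.fullmatch(r"[A-Za-z0-9_.~/-]+", directory):
        raise ValueError("directory name contains disallowed characters")
    return directory

# Escaping as a second line of defense: shlex.quote wraps the string
# so the shell treats metacharacters like && as literal text.
quoted = shlex.quote("~/ && cat /etc/passwd")
```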

Now let's look at how all this applies to SQL injection: frontends use SQL to communicate with the backend database. SQL is a nice abstraction, because rather than having to learn a different set of methods for each database, you have a set of common commands and you only have to learn the differences (of which there are plenty, but they mostly manifest themselves in complex operations like creating tables; the syntax for basic operations like selecting, inserting, updating and deleting data is pretty much the same across the board). One (very bad) way of creating SQL commands (also called SQL statements) is the one discussed earlier, where we treat the command as one big string and insert the parameters into it with concatenation. To give a concrete example, let's say that we have a login form where the user supplies the username and password. A query might look like this:

SELECT user_id FROM users WHERE name='$username' AND password='$password'

In the above example, $username and $password are placeholders where the strings supplied by the user are inserted. Now let's suppose that the user enters the following strings for the username and password: ' OR '4'<'5 and ' OR '6'>'2. The resulting query would look like this:

SELECT user_id FROM users WHERE name='' OR '4'<'5' AND password='' OR '6'>'2'

This would allow the user to log in even when s/he doesn't have a valid user account. Even worse, let's suppose that the user entered the following string in the username field: '; DELETE FROM users; --. This would result in the following query:

SELECT user_id FROM users WHERE name=''; DELETE FROM users; --' AND password=''

Now we just convinced the database to delete all the entries in the users table (-- marks a comment in SQL, so everything which comes after it is ignored. This is a convenient way to eliminate the rest of the query, so that the attacker doesn't have to make sure that the rest of the query is syntactically correct).
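The login bypass can be reproduced in a few lines with Python's sqlite3 module and an in-memory database (SQLite stands in here for whatever RDBMS the site uses; the table layout follows the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INTEGER, name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice', 'secret')")

# Attacker-supplied "username" and "password":
username = "' OR '4'<'5"
password = "' OR '6'>'2"

# Vulnerable query construction by string concatenation:
query = ("SELECT user_id FROM users WHERE name='" + username +
         "' AND password='" + password + "'")
rows = conn.execute(query).fetchall()
# rows now contains every user_id, although the attacker knew
# neither a valid name nor a valid password.
```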

Now that you know what SQL injection is, how can we protect against it?

The best solution is to use prepared statements. A prepared statement is an SQL query where we place a marker in the places where user data will be (usually the question mark) and let the parser pre-parse it. In our case this would look like the following:

SELECT user_id FROM users WHERE name=? AND password=?

In the second step we bind values to those locations, but because the parser already knows that what we are supplying is user data, it won't mistake it for SQL commands.

If you don't want to rewrite your whole database access layer, make sure that inputs are properly filtered (if they are supposed to be integers, make sure they are integers, and so on - and make sure that you check out the tip mentioned above if you are using regular expressions for this) and escaped before they are fed to the database. However, you should gradually migrate to prepared statements, because they are more secure and they also make your code easier to read.

A third very important step is to connect to the database as a user which has only the privileges it needs (e.g. not the root user). For example, it may select, insert, update and delete rows, but it cannot access other databases, create tables, lock or drop tables, create other users, etc. Even better, create different users for different use cases (for example a normal user may only select data, a registered user may select, insert and update data, and an administrative user may select, insert, update and delete data) and use them to connect in the different scenarios. You may also implement logging (very easy if your RDBMS supports triggers) into a table to which none of these users has access. While the latter suggestions are complicated and hard to retrofit onto existing systems, the first one is a definite must!

Also, some database systems (particularly recent versions of MySQL) don't allow multiple SQL statements in one query. This means, for example, that the query above (the one deleting all the user data) won't work; however, you are still left with serious information disclosure vulnerabilities, and all it takes is a legitimate DELETE statement to remove all the data.

Finally, a very important thing: don't store more data than you have to. Read about hash functions and learn to use them. The main idea of a hash is that it is very easy (computationally speaking) to calculate the hash of a given text, but very hard to find the original text given its hash. This makes hashes very good at verifying the equality of data without actually storing the data. As an added bonus, they have a fixed output length for input of any size, so you can plan the space requirements of your database very precisely while simultaneously allowing your users to enter data of arbitrary length. For example, the following method of storing passwords is quite secure: instead of storing the password itself, store the hash of the following string: [a long - meaning 32 characters minimum - random string which is hardcoded in your application] + [username] + [the password supplied by the user] (where + stands for string concatenation). When you want to authenticate a certain user, you calculate the hash of [random data] + [username] + [supplied password] and check whether it matches the one stored in the database. The advantages of this method are:

  • If someone gets access to your user data, s/he can't directly find out the passwords (revealing passwords is very risky, because most users reuse the same password at multiple sites, and one such compromise might lead to a chain of compromised accounts). This holds even for insiders, like system administrators or database administrators.
  • It eliminates the possibility of guessing the password using rainbow tables, which are basically huge tables storing entries like [text], [hash of text] for fast lookup (meaning that if you know the hash, you can get the text simply by doing a lookup in the table). Such tables grow very big very fast, so the biggest ones out there cover texts under 16 characters, as far as I know. Because you prefix each string with a 32-character secret before hashing, you guarantee that even the weakest password produces a hashed string at least 32 characters long, putting it out of reach of such tables.
  • If the attacker were to generate such tables for your database (assuming s/he has both the database and the random string), s/he would have to generate a different table for each user, because the username is also hashed in.

The downside: you can't send the password to the user if s/he forgets it. However, there are alternative solutions, like sending the user a link from which s/he can reset the password. Of course, these links should time out after a reasonable amount of time (6-12 hours!). Also, weak passwords are still vulnerable to bruteforce attacks, so you should encourage your users to choose strong passwords.
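The storage scheme described above can be sketched with Python's hashlib (SHA-256 is my choice of hash; APP_SECRET is a hypothetical placeholder for the hardcoded random string):

```python
import hashlib

# Hypothetical application-wide secret: at least 32 random characters,
# hardcoded in the application, as the scheme above requires.
APP_SECRET = "Zq8vN1xR4tLw0bK7dY3mH6sJ9cF2gA5e"

def password_hash(username, password):
    # Hash of [secret] + [username] + [password], as described above.
    data = (APP_SECRET + username + password).encode("utf-8")
    return hashlib.sha256(data).hexdigest()

# At registration time, store only the hash - never the password.
stored = password_hash("alice", "hunter2")

def authenticate(username, supplied, stored_hash):
    # Recompute the hash from the supplied password and compare.
    return password_hash(username, supplied) == stored_hash

print(authenticate("alice", "hunter2", stored))  # True
print(authenticate("alice", "wrong", stored))    # False
```

Note that today the usual recommendation goes further: use a per-user random salt stored alongside the hash and a deliberately slow function (bcrypt, scrypt or PBKDF2) rather than a single fast hash, to make bruteforcing each individual password expensive.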

Update: you can play with SQL injection attacks (to get a better feel for them) on the examples provided by Foundstone (now owned by McAfee) - click on the SASS tools - or listen to the corresponding episode of the Mighty Seek podcast.

Thursday, April 12, 2007

Active vs. Reactive protection


Hello all.

I want to bring to your attention the following article written by fellow blogger Kurt Wismer: defensive lines in end-point anti-malware security. I especially like it because it puts AV technology in place and creates a good foundation to start any meaningful debate.

Here are my opinions on the matter (in no particular order):

  • All the technologies enumerated in the post can be categorized as either active or reactive. Content filtering is reactive (even with heuristics - see my point below), while application whitelisting is active. From a security standpoint, active technologies are preferable to reactive ones, but they usually come at the cost of usability (if you tell the user that s/he can only use a certain set of applications, there is a very big chance that s/he will be dissatisfied).
  • Heuristics isn't "magic which can catch unknown malware". It only means that software can catch a large category of malware generically (for example all the programs which try to access Device\PhysicalMemory directly to hide themselves). It catches unknown malware in the sense that the given sample was (possibly) never seen by a human analyst, but it is still based on known principles. Because of this, every heuristic solution can be defeated by (a) using unknown techniques (b) not using a given technique or (c) obfuscating the usage of the technique.
  • Given the above facts, the Consumer Reports debate is meaningless - or meaningful only in the restricted sense that it tries to communicate a message which was always known inside the AV industry: it is always possible to create undetectable malware (and malware authors can do this simply through iterative development - creating a variant, testing it against AV products, modifying it, testing it again, and so on until it is no longer detected). This is a fact which the marketing departments of AV companies often try to gloss over, since they would like to give the impression that you can buy total protection for your money.
  • This is also why you are better off choosing an AV vendor other than the big two: malware authors usually test their creations against those products and don't bother evading the detections of the smaller ones (it doesn't make sense from a business standpoint - if the client most likely runs a certain AV product, evading that one product is enough to infect a considerable number of computers)
  • Given its reactive nature, the two things you can meaningfully test when comparing AV solutions are (in this order): (a) reaction time and (b) the program's ability to clean up after an infection. The flow of events is usually as follows: malware gets developed -> it starts to spread -> it spreads to a user base statistically large enough for the AV company to get samples -> signatures are developed and distributed -> the infected machines are cleaned. This is why it is very important to run all programs with the lowest possible privilege, so that they can't subvert the AV solution before a signature arrives.
  • Most AV products today try to offer additional features (like firewalls or network traffic filtering) to defend against attacks which are not part of the traditional file-based security model (for example exploits which travel over the network and never touch the disk, or for which, by the time they touch the disk, it is already too late because the browser has executed the code and they are sitting in the browser cache). However, these solutions are also signature-based and reactive (which is not necessarily a bad thing, but one must keep it in mind when evaluating such a solution).

And finally: for static environments (like companies, or home users with a limited set of needs), whitelisting is the way to go. Unfortunately this approach doesn't have enough marketing money behind it and is too complicated for the home user to implement (if s/he knew which applications are safe and which are not, we wouldn't have this problem in the first place). For home users, managed security would be the way to go, but since there is no user awareness, this too remains mostly reactive (in the sense that you call somebody after your computer breaks, not before, to prevent it).

Monday, April 09, 2007

Short news



The Shmoocon 2007 videos start to appear.

A hacker challenge for the conference is still online, so you can give it a try. From what I saw it is very nice (it needs all kinds of different skills, from overflowing buffers to writing SQL injections).

Sunday, April 08, 2007

AOL Bullying Gaim!


This is deeply troubling:

AOL is forcing Gaim to change its name

Please kindly contact AOL, and bring these points to their attention:

  • The users of Gaim are highly technical and the probability of them confusing AIM and Gaim is vanishingly small
  • Searching for AIM on search engines (like Google or Yahoo) does not bring up results with Gaim on the first page (in fact it consistently gives the official AIM site as the first result), so there is no risk of the average user confusing Gaim with AIM
  • Gaim has already changed its name once at the request of AOL
  • Gaim (the project) does not make any money off the fact that it connects to AOL's network. In fact it is run by a volunteer community who donate their time to the project.

With all these points in mind, AOL's action is both immoral and makes no sense from a financial standpoint. It can only be classified as meaningless bullying of the open source community!

Please let this be an April Fools' joke (although it really doesn't seem to be, since it was posted on the 6th, and who would joke about legal action).

Update - the story got posted on Slashdot and here are the most informative comments:

The facts laid out by the Gaim developers were:

  • GAIM had the name first
  • AOL forced them to take the name GAIM because "GTK + AOL Instant Messenger" was too infringing.
  • When AOL decided to trademark AIM, GAIM became too infringing
  • AOL systematically and repeatedly harassed the developers until they gave up

It's not Pigeon - it's 'Pidgin', which refers to a number of English-derived dialects spoken in Vanuatu, Papua New Guinea and the Solomon Islands in the South Pacific. The language is simple in construction and has a very limited vocabulary, but it can be quite poetic. I speak Bislama, the Vanuatu version of the language, which contains elements of French as well as English. The syntax is very much like English (subject - verb - object), but its idiom is derived from the hundreds of local languages. I don't know whether the team were aware of this when they chose the name, but Bislama and the other South Pacific Pidgins are spelled phonetically, which makes it really easy to understand. Example: Mi wantem toktok long yu Means "I (me) want to talk to you." This phonetic spelling makes it absolutely ideal for texting, because there are few if any of the crazy English spellings that stretch on forever without adding anything to the word - 'thought', for example, is simplified to 'ting'. When SMS was recently introduced into Vanuatu, even expat folks like myself found ourselves texting in Bislama, because it's more concise. So with all that in mind, I'll simply say, "Mi ting se 'pidgin' hemi wan gudfala nem blong givim long kaen software olsem. Smol tingting blong mi nomo.'

"Pidgin" is actually an adjective describing a simplified combining of languages, not a specific language family. There are pidgin languages spoken all over the world combining many languages, not always English. Many pidgin languages are named some variation of "Pidgin" but they don't have exclusive claim to the title. More information here:

There is no such thing as "Intellectual Property". It is propaganda. There are copyrights, patents, and trademarks. They are very different from each other. Anyone using the term "Intellectual Property" to group the three of them is either confused or is trying to mislead others. Watch this speech by Richard Stallman. Warning: it's 2 hours.

Well, I often have a problem with that too. People of other religions often assume that *my* religion requires a kind of faith similar to theirs, and that it affects my life in similar ways to theirs. When in fact different religions often have strikingly different effects on the societies in which they exist: for example, it's often said Islam encourages a confluence of spiritual and temporal authority, while in most Christian-majority societies this has rarely been the case since the Reformation. But I digress... A lot of FLOSS people despise the term "intellectual property" since it's often used intentionally to confuse people, by encouraging the belief that trademarks, copyrights, and patents give the same kinds of monopoly rights. When in fact, this is far from true. For example, Linus Torvalds holds the TRADEMARK for the name "Linux". But he does not hold the copyright for most of the code in the Linux kernel, since most of it has been written by other individuals and companies. And IBM may hold the patents on some algorithms used in the Linux kernel, but again this does not mean they hold the copyright for all of the code. None of this is a problem as long as no one is suing anyone. But then we get ass clowns like SCO or Microsoft who come along and make threats about how "Linux is infringing on our 'intellectual property' rights." That frightens a lot of users needlessly, and it's complete bullshit unless they care to specify exactly what rights they are talking about: trademark, copyright, or patents. All have COMPLETELY different repercussions. The FSF are totally right to deplore the use of the term "intellectual property" in my opinion. It is meaningless except as FUD.

Also, many people bring up the Lindows vs. Microsoft action, but the interesting fact is that Lindows didn't lose - they willingly gave up the name.

They also switched to monotone for version control and they will start accepting donations, as can be seen from the following quote (previously this was not possible):

Next I'd like to address paying for Pidgin. In the past this was not possible for numerous reasons, including taxing and trusting individual people with the money. Now, however, when the infrastructure is in place, anyone who wants will be able to "pay" for Pidgin by donating to the project and the Instant Messaging Freedom Corporation. Just be patient a bit longer and such things will be in place so anyone who wishes to contribute money may do so.

Friday, April 06, 2007

Securing the Internet


There is a great series of articles over at the Matasano blog about the deficiencies of DNSSEC. While I have no deep knowledge of the matter, the series seems to bring up very valid points against this security feature (the one nearest to my heart being the CPU cost of cryptography - it is expensive even on modern hardware, which in a sense is good, because it also means it's harder to break). Here are the links to the currently available articles (two more are to come):

What I want to write about today (or better yet, plea to the ISPs) are two things:

  • Egress filtering - this means that routers check the source IP of packets which leave their perimeter and drop any packet with an invalid source IP (meaning an IP from a subnet not associated with the given interface). If all ISPs performed this filtering, IP spoofing would be greatly reduced (it would only be possible to spoof another IP from the local subnet) and traceability would improve (in case of a DDoS, for example). Such filtering could be (and may already be) deployed without any impact on legitimate applications. I can think of only two possible problems with it: routers needing additional configuration for each interface (though maybe there are intelligent routers which can perform egress filtering based on their routing tables) and routers needing additional processing power for the filtering.
  • Filtering the SMTP port on consumer networks (except towards the SMTP server of the ISP). This would greatly reduce spam. There is no reason for the average consumer PC to connect to SMTP servers other than the one of the service provider. The only case when a user would need to do such a thing is when s/he has a mail account elsewhere (for example at work or at another service provider). These rare cases could be resolved very simply, and without burdening the helpdesk too much, by offering a web page (with a printed description of it given to each customer at connection time) - available only to the customers of the ISP - which would permit adding one exception at a time for the IP the request came from (so that one customer can't add exceptions for other customers) and a specified host. The submission should be protected with a CAPTCHA.
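The egress-filtering rule from the first point can be sketched in a few lines (a toy model of the decision logic, not router code; the interface-to-subnet mapping is hypothetical):

```python
import ipaddress

# Hypothetical interface -> subnet mapping; a real router would derive
# this from its configuration or, ideally, from its routing table.
INTERFACES = {"lan0": ipaddress.ip_network("192.168.1.0/24")}

def egress_allowed(interface: str, src_ip: str) -> bool:
    # Drop any outbound packet whose source IP does not belong to the
    # subnet associated with the interface it arrived on.
    return ipaddress.ip_address(src_ip) in INTERFACES[interface]

print(egress_allowed("lan0", "192.168.1.42"))  # True - legitimate source
print(egress_allowed("lan0", "10.0.0.1"))      # False - spoofed, drop it
```

As the post notes, a spoofer behind such a filter is limited to addresses within their own subnet, which is exactly what makes DDoS traffic traceable back to the originating network.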

It is my opinion that these two measures, if implemented by the vast majority of (preferably all) ISPs, would greatly reduce the malicious activity on the Internet.

My submission for The Ethical Hacker Skillz Challenge


The submission date for the 8th ethical hacker skillz challenge is over and I'm eagerly awaiting the results (which should be published any day now). Until then here is my version of the solution, maybe somebody finds it useful someday:

  1. What is the significance of various numbers in the story, including the speech patterns of the goose and Templeton?

    Both the goose and Templeton have a tendency to use larger words - in the case of the goose by repeating parts of words, and in the case of Templeton by using longer and longer words. The series of numbers representing how many times a given syllable was repeated, respectively how many characters the words contained, is 2 3 5 7 11 - the first five prime numbers (numbers divisible without remainder only by one and themselves).

    As for the two prices (1,618.03 and 2,718.28), I have no idea, but I have found this PDF, which seems to represent some price list where both numbers are the prices of different Mercedes models.

  2. How had Charlotte and the Geography Ants fooled Lurvy's integrity-checking script?

    They created two files (t1.html and t2.html) which had different content but the same MD5 hash. This was possible because research in this area has made finding two such streams of data (usually referred to as a collision) rather simple for the MD5 hash. One interesting aspect to note is that because MD5 (like most other hash algorithms) processes its input sequentially, in fixed-size blocks, it is enough for the attacker to generate two different headers of the same size which have the same hash; if the data that follows is equal, the resulting hashes will be equal too.

    For example if we have the following two streams:


    if MD5(AA...A) is equal to MD5(BB...B), then MD5(AA...ASS...S) will be equal to MD5(BB...BSS...S), regardless of what the sequence SS...S contains. In this concrete situation, the AA...A and BB...B parts were the two colliding byte sequences originally published in the 2004 paper by Xiaoyun Wang et al. entitled "Collisions for Hash Functions MD4, MD5, HAVAL-128 and RIPEMD", and what followed was JavaScript which basically said: if the first part contains the string (in C notation) "\xC2\xB5\x07\x12F" (where \x?? means that the ASCII code of the character is ??, ?? being a hexadecimal number), display one variant, and if it doesn't, display the other.

    The modus operandi was to place the first file (AA...ASS...S in our example) on the site, which displayed the first message, and then swap it at the right moment with the second file (BB...BSS...S), which had the same size and MD5 hash but displayed the other message, because the bytes in its "header" were different.
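The block-by-block behaviour described above can be illustrated with Python's hashlib. This is my own illustration and does NOT produce a real collision (that requires the colliding blocks published by Wang et al.); it only shows that the digest of prefix + suffix depends solely on the hash state after the prefix:

```python
import hashlib

prefix = b"A" * 64                # stands in for one 64-byte MD5 input block
suffix = b"<script>...</script>"  # the shared tail, e.g. the JavaScript

h1 = hashlib.md5(prefix)
h1.update(suffix)                 # input fed incrementally, block by block
h2 = hashlib.md5(prefix + suffix) # input fed all at once

# Identical internal states after the prefix guarantee identical final
# digests - which is exactly why two colliding prefixes plus any common
# suffix still collide.
print(h1.hexdigest() == h2.hexdigest())  # True
```

In the attack, the two colliding 64-byte blocks put MD5 into the same internal state, so appending the identical JavaScript payload to each keeps the final hashes equal.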

  3. Why did Charlotte have to change the website before the integrity-checking script ran for the first time? Why couldn't she deface it later?

    Because the research done in this area only showed how to generate two data streams with the same hash, not how to generate a data stream matching a given hash. Had Charlotte waited for the integrity-checking script to run for the first time and create a baseline hash, she would have had to solve the second problem (given a hash, generate a file which has that hash). While this is theoretically possible (we are mapping an infinite number of possible data streams to a finite number of possible hashes, so collisions have to exist), it is computationally much more expensive.

  4. How should Lurvy's script have functioned to improve its ability to detect the kinds of alterations made by Charlotte?

    The script should have compared the hash of a local version of the file (the original file, before it was uploaded) with the hash of the remote version. The script should also have used hash algorithms with no known weaknesses (like SHA-256, SHA-512 or WHIRLPOOL). An even more secure solution would have been to compute all of these hashes and alert if at least one of them didn't match. This wouldn't have eliminated the problem (because, again, we map an infinite number of possible data streams to a finite number of hash values), but it would have made an attack computationally very expensive. Of course, the best thing would have been to download the file from the website and compare it byte-by-byte to a stored version. Given the small size of the file, and the fact that it already had to be downloaded (for the MD5 hash to be computed), this would have caused no performance problems and would have provided a 100% reliable way of making sure that the page hadn't changed since the baseline was created.
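A sketch of such an improved integrity checker (my own illustration, assuming the original file content is kept locally): it records several digests as the baseline and can additionally fall back to a byte-by-byte comparison:

```python
import hashlib

def baseline(content: bytes):
    # Record several independent digests of the original file.
    return {alg: hashlib.new(alg, content).hexdigest()
            for alg in ("md5", "sha256", "sha512")}

def verify(content: bytes, base, original: bytes = None):
    # Alert if any digest differs; using several algorithms means an
    # attacker would need a simultaneous collision in all of them.
    for alg, digest in base.items():
        if hashlib.new(alg, content).hexdigest() != digest:
            return False
    # Strongest check: byte-by-byte comparison against the stored copy.
    if original is not None and content != original:
        return False
    return True

page = b"<html>original</html>"
base = baseline(page)
print(verify(page, base, page))                     # True
print(verify(b"<html>defaced</html>", base, page))  # False
```

With the stored copy available, the hash comparisons are really just an optimization; the byte-by-byte check alone already defeats any collision-based swap.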

  5. What was Charlotte's proposal to Lurvy for saving Wilbur?

    The file counterhackreloadedsteg.png contains an embedded Microsoft Word document. It was embedded with the Digital Invisible Ink program mentioned in the hint, using the password "baaramewe" (without the quotes - corresponding to the letters in red, in lower case), and has the following content:

    Charlotte’s Proposal

    Dear Mr. Lurvy,

    Now that I have got your attention, I have a proposal for you. You are obviously a bright businessman, trying to make some money on the sale of Wilbur. But, surely you must recognize the fleeting nature of that one-time sale. I propose to you a better business model, one that can keep this farm profitable for years to come.

    Employ me, Charlotte the Spider, as a web site designer, contracting my services out for $150 per hour. I will charge you only $ 50 per hour. Thus, working only 40 hours per week, I can bring in more cash for you every single week than the one-time sale of Wilbur. If you are interested in my offer, please send e-mail to, with the subject: CHARLOTTE’S PROPOSAL.

    Yours truly,


Final note: while the method in this story is interesting and demonstrates the concept of hash collisions, it has two major drawbacks:

  • it breaks horribly if the target doesn't have javascript enabled
  • it is very clear that something is wrong even after a brief examination of the page source.

In a real-world scenario a stealthier method would most probably be used. For example, linking to an external script file - placed on a server controlled by the attacker - from a script tag included in / appended to the document. This method resolves both of the problems mentioned earlier:

  • it degrades nicely in browsers which don't support javascript or don't have it enabled - it will show the original page
  • a cursory examination of the page source might not reveal the source of the problem

Further advantages:

  • the file needs to be modified only once; all subsequent modifications are performed in the linked script file. This means that as long as the modification is performed before the baseline for the integrity check is created, we have successfully bypassed all the enumerated integrity-check methods, except those which kept an original version of the file from before it was uploaded and used that as the baseline.
  • the JavaScript can be generated dynamically, so it is possible, for example, to show the original version to some originating IPs and the altered version to others, making the discovery of the modification even harder.

The linked JavaScript would include code generating an IFRAME pointing to an arbitrary page and overlaying it over the original page. Using an IFRAME ensures that the URL in the address bar doesn't change (which could raise suspicion).

Linux tips


Via the All About Linux blog: bash completion - if you type ls -- in your terminal and tap the tab key twice, it will list all the available options. This works only for the most common commands (like ls, rm, ...) but it's still a nice add-on. And best of all - it comes preinstalled with Ubuntu (on other distros you might need to install the bash-completion package with the corresponding package manager).

Thursday, April 05, 2007

Moving to Ubuntu - swap partition


I continued to perfect the solution for the Ubuntu swap partition problem (although I just upgraded to 1 GB of memory, so it doesn't manifest itself as quickly as before, w00t!), and would like to share my results:

  • As posted earlier, you can use the free command to check if your swap partition is activated (on the last line where it says Swap you should see a non-zero value if it's active)
  • From ubuntuguide: you can use the sudo fdisk -l command to find your swap partition (you can identify it by the system type where it should say Linux swap / Solaris). Alternatively you can use gparted (not installed by default) to visually view your partition scheme.
  • If you want to go the UUID route, you can find out the UUID which corresponds to a certain partition by doing sudo vol_id -u device (in my case this would translate to sudo vol_id -u /dev/hda5) - thanks to this blog post
  • Now you have enough information to adjust your /etc/fstab file to point it to the correct swap partition. After a restart, you should issue the free command again to make sure that your swap partition got mounted properly.

One reason for using the old /dev syntax would be that gparted seems to change the UUID of the swap partition when it moves it around (most recently I lost my swap partition when I was resizing the partitions on my machine).

Input validation


The month of PHP bugs is over, but you should still watch the PHP-Security blog, since there are good things coming from there, like this article: Holes in most preg_match() filters. Go read it if you are using regular expressions for input validation. Two tips to avoid these pitfalls:

  • Cast your input to the datatype you expect before validating
  • Use capture groups to extract the values which interest you, rather than trying to validate the whole string (this also adds usability, because it tolerates tabs / spaces users may include at the beginning or end of the input - for example because they copy-pasted it from a Word document)
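Both tips can be combined in a short sketch. The linked article discusses PHP's preg_match, but the idea translates directly; `parse_user_id` is a hypothetical example in Python:

```python
import re

def parse_user_id(raw: str):
    # fullmatch anchors the pattern to the ENTIRE string (avoiding the
    # classic ^...$ pitfall where $ also matches just before a trailing
    # newline), while the capture group extracts only the digits and the
    # \s* tolerates whitespace left over from copy-pasting.
    m = re.fullmatch(r"\s*(\d+)\s*", raw)
    if m is None:
        return None
    return int(m.group(1))  # cast to the datatype we expect

print(parse_user_id(" 42\t"))              # 42
print(parse_user_id("42; DROP TABLE x"))   # None - rejected outright
```

After the cast, the value is a genuine integer, so even if it later ends up in a concatenated SQL string it cannot carry an injection payload.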

Lies, Damn Lies and Statistics


I'm back with more critique for Deb Shinder (who, for one reason or another, doesn't allow commenting on her blog, so I can't post there directly). Read part one (Biometrics is not the answer!) and part two (Three letter acronyms don't provide good security!) for more opinionated posts.

The post I'm talking about is Is Firefox less secure than IE 7?. First a little disclaimer: I may be biased in this matter (but who isn't), as someone who's been using and loving Firefox since version 0.9. The sentence I take the most issue with is the following: Firefox alone in recent months has had more exploits than Windows XP and Vista combined (yes, I should complain to George Ou for this one, and be sure that I will). People, please, let's limit ourselves to useful and meaningful information instead of constructing bogus and meaningless statistics to prove our points. If we have biases, let's come out and share them (like I did earlier) and let's try to compare apples to apples and oranges to oranges. This quote is insulting to the intellect of your readers (who are smart enough to realize that within MS there are different teams working on different products, and they are so separated that you could almost call each a company within a company). It is as if I said: IE had more vulnerabilities than there were full moons in 2006, so it is bad.

To finish up with another statistic (again biased, but at least it is clear from the context): during 2006 Internet Explorer was vulnerable for 286 days without a patch being available (78%) and Firefox for 9 days (2.5%).

Full disclosure - yet again


I came across this post about ethical hacking and felt the need to respond publicly, since (I feel that) the article offers a skewed view and does not present the counter-arguments:

First of all, I would like to stress that discovering and writing exploits for certain types of flaws (and I'm not referring to XSS :) ) does require serious knowledge and skills which 99.9% of programmers do not possess (and I'm saying this as a malware analyst who does reverse engineering as part of his daily job). While humility is good, the fact of the matter is that these people are part of a select group. Also, as a sidenote: a large percentage of programmers (I don't want to guess, but certainly more than half) do not understand even the basics of the security risks which may affect their products (in the case of a "web application" this may be SQL injection, or in the case of a binary product something like stack overflows).

Second of all, from an economic point of view: software vendors have no financial incentive to fix bugs (and by bugs I mean here problems which wouldn't come up in real life - i.e. they wouldn't bother the customers - but which under exceptional circumstances - like a specially crafted query or input file - could lead to information disclosure / arbitrary code execution / etc.). Fixed bugs don't sell products. New features sell products. And most of the time the client isn't knowledgeable enough to assess how secure the product s/he buys really is. One might argue that competitors could disclose the bugs, but this rarely happens, because the competing companies know that their own product is equally buggy: if they disclose a bug in a competitor's product, the competitor will try (and most probably succeed) to find bugs in theirs, and the whole thing will come crashing down. In this sense "ethical hacking" and the threat of full disclosure play a role in keeping the players (at least somewhat) honest.

Where the ethics part comes in (in my opinion) is thinking about the customer. As I see it there are two extremes:

  • the "bad guys" discover the vulnerability and use it to take advantage of the users of the certain product without anybody knowing it
  • the vendor discovers it and patches the problem (hopefully before anybody else discovers it)

Of course (as with everything) there are many shades of gray in between (like customers not deploying the patch right away and the "bad guys" reverse engineering it to find the flaw it fixes, then exploiting it against the customers who didn't apply the patch), but I didn't want to complicate this description.

The "ethical hacker" approach falls somewhere in the middle: after discovery let the vendor know and if it doesn't care (doesn't communicate with you, doesn't promise to release a fix within reasonable time-frame), release the vulnerability publicly, preferably with methods for potential customers to mitigate it. Why should it be released? Because as time passes, the probability that the "bad guys" find it increases! As an independent security researcher you don't have any other choice than to follow this path (because I don't think that very many companies will admit that they screwed up and bring you in to help them - because this would mean admitting failure which would result in many management types loosing their bonus packages which they don't want).

There are many bad apples in the "research community" who place personal pride before the interest of the customers, but they are not practicing ethical hacking!

However, the example you cited does not apply. A big vendor can not disregard a serious vulnerability just because of the style of the communication. Do you consider that just because I write "I'm the king of the world and you know s**** about software development" in an e-mail to MS in which I disclose a remotely exploitable flaw for Vista, they should disregard it? If the vulnerability is genuine and the vendor really doesn't communicate (doesn't even acknowledge receiving the mail), there is no other possibility than going public (again: preferably with a mitigation method for clients) - the alternative being to wait until the "bad guys" discover the vulnerability and exploitation becomes widespread enough that the company is forced to do something about it. Here are some examples which you should consider:

  • Apple trying to discredit security researchers who found exploitable flaws in their wireless drivers
  • A person being arrested and prosecuted because he discovered an information disclosure vulnerability in the website of a university and tried to notify them!
  • Amazon not fixing a bug for one year (!) which allowed arbitrary websites to manipulate your shopping cart if you were logged in to the Amazon website
  • Oracle not fixing vulnerabilities for years
  • Microsoft knowing about the recent .ANI vulnerability since December (by their own admission) and not releasing a patch sooner (exactly how many months does it take to test a patch?!), and when releasing it, breaking software (the latter can of course be the fault of poorly written software, but in this case that's not probable)
  • And finally a personal story: I discovered that using some CSS and JScript I could crash IE at will. This was several years ago. I tried to notify MS several times and got no response. The vulnerability persisted (I last verified a couple of months ago that my fully patched XP box with IE 6.0 was still vulnerable). Now I'm not going to disclose the code to anybody (because I migrated away from MS products), but after such a long period of time, don't you think I would be justified in doing so?

You can't rely on companies to try to make the most secure products. They will make the products which generate the most revenue. Cars didn't have safety belts until manufacturers were forced to add them. The same way, software vendors won't place security first (or at least in the first 3 positions) until they are forced to.

Tuesday, April 03, 2007

Month of PHP bugs roundup


The month of PHP bugs is over and I thought I'd make a little list of things you can do to mitigate the bugs where possible:

  • Update to PHP 5.2.1 and watch out for the next version, updating to it as soon as it comes out. Do not use PHP4, because there is a vulnerability which will not be fixed by the developers (because PHP4 is considered old code).
  • Install Suhosin (unfortunately it is currently only available for Linux)
  • If you have the Zend platform installed, take a look here to see if you are vulnerable to these exploits
  • Disable the following functions (there are some very common functions here, so unless you run your own server, you won't generally be able to disable them):
    • phpinfo()
    • substr_compare() - if you really need this function, you can find a replacement for it written in PHP on the documentation page (I didn't test it, but it looks like it should work).
    • mb_parse_str()
    • iptcembed() - already disabled if you disabled the GD extension
  • Disable the following extensions (they are rarely used, so if you are a shared hosting provider, you can most probably get away with disabling them - and of course, if you host your own servers, you should disable all the extensions which you don't use!):
    • WDDX
    • Ovrimos (in PECL, but you may have installed it with an older version of PHP)
    • The zip extension from PECL
    • bz2_filter
    • SQLite - the issues with it are fixed in PHP 5.2.1, however be sure to read the description here before relaxing (because you might use a different version than you think).
    • the GD extension - this is relatively widely used, so you can get away with disabling it only if you own your own server
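To make the function list above concrete, here is a sketch of the relevant php.ini lines. This is only an illustration, assuming you've verified that none of your sites need these functions; the extension file names vary between distributions, so check yours before commenting anything out:

```ini
; php.ini sketch - apply only after checking your sites don't need these!
disable_functions = phpinfo,substr_compare,mb_parse_str,iptcembed

; extensions are disabled by removing/commenting out their load lines,
; for example (file names are distribution-dependent):
;extension=wddx.so
;extension=zip.so
;extension=gd.so
```

Note that disable_functions takes a comma-separated list of bare function names (no parentheses) and can only be set in php.ini itself, not per-directory.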

Also, my previous recommendations remain valid:

  • Run PHP as CGI rather than as a shared module
  • Configure your firewall rules strictly (if you don't have anything on port 4444, do not open port 4444!)
  • Consider using mod_security. While it is not perfect, it provides you with an added layer of security
  • If you are using a shared hosting account, consider moving to dedicated servers or VPSs! If you decided against it, consider it again! Think about this
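As an illustration of the mod_security recommendation above, the 1.x series (current at the time of writing) lets you set a default action and add simple filters in the Apache configuration. The rule below is my own minimal sketch, not a recommended ruleset - tune it before using it anywhere:

```apache
# mod_security 1.x sketch - illustration only
SecFilterEngine On
SecFilterDefaultAction "deny,log,status:403"
# reject requests containing an obvious directory traversal attempt
SecFilter "\.\./"
```

Even a small ruleset like this adds a layer in front of buggy PHP code, which is exactly the point: it doesn't fix the bugs, it just makes some of them harder to reach.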

Of course, even if you apply all these measures you won't be 100% safe, because some bugs remain unfixed and Esser hinted that he might be back later this month with another month's worth of PHP bugs.

Shared risk of shared runtimes


I love interpreted languages. I love PHP, Perl, Java, C# and all the others. The liberty they give you is incredible! However, there is a security aspect to them: because the actual machine code is shared by all the programs written in one particular language, security features / products which depend on the executable image to uniquely identify processes fail on them.

Some security features which may be broken in some cases are: personal firewalls (they get confused and allow / deny the wrong application) and running scripts under different user accounts (they may all end up running under the wrong account).

The methods used by the runtimes can be divided into four categories:

  • A single runtime process shared by all the scripts - this is very rarely used; in fact I don't know of any widely used interpreted language taking this approach (there is an experimental project for Java which loads all the files in a single VM). The advantage of such an approach would be (slightly) better loading speed and a (slightly) smaller memory footprint (depending on how big the private structures allocated for an instance are), but from the security standpoint there would be many disadvantages:
    • There would be no easy way to tell how many scripts are running inside the same VM
    • If one were to take the SHA1 hash of all the running executables (in a forensic examination for example), one would get the same hash for each running script (the one which corresponds to the interpreter), and even worse, one would get only one hash regardless of how many scripts are running inside the interpreter.
    • Security products which depend on the process to uniquely identify the code trying to do certain things (like a personal firewall filtering internet access) would be unable to distinguish between the different scripts and they would allow or deny the rights to all of the scripts.
    • Scripts would have to be run under the same user account (the account the interpreter was first started under). There are facilities in Windows, for example, for different threads of the same application to run under different user privileges, but (a) it takes additional effort to implement them and (b) if there are facilities (functions or exploits) in the language which allow scripts to execute binary code directly, a script can jump threads and execute at the privilege level of another thread. Also, there would be a potential for information disclosure vulnerabilities between different scripts.
    Luckily there are no widely used interpreters employing this approach.
  • A second method is to have multiple instances of the interpreter executable running, one for each script. This is a very widely employed technique (Perl, PHP, Ruby, Python, Java and many others use it). This is slightly better, because processes can easily be run under different user accounts and the issue of information disclosure between scripts is resolved. Also, one can determine which script corresponds to which instance fairly easily by looking at the command line of the process (which can be determined with a tool like Process Explorer). However, the problem with security products like personal firewalls or HIPS remains, since from their point of view all instances of the interpreter are the same.
  • The third method is a variation of the second. In this case there are small executable files corresponding to each script which - when launched - load the interpreter (by loading a DLL or COM object, for example) and then execute the script. This is the approach taken by the Adobe Apollo project, for example. This method solves all but one of the problems enumerated above (the one it doesn't solve is the identical hash for running executables) and is also good from a usability standpoint (because it gives the user something s/he is probably more familiar with - an executable file rather than a script file which needs an interpreter). There is some grey area between cases two and three, represented by things like script compilers which bundle the script and the interpreter in one executable, or the small executables used to bootstrap programs like Eclipse, which in turn run the installed Java VM (so that you are back to case two).
  • Finally there is the solution which is the best from a security standpoint: the executable contains the code to be interpreted and it loads the interpreter in its own address space (as a DLL or COM object for example). This means that:
    • Different executables are different (and thus have different hashes - hopefully - unless you discovered a hash collision)
    • Programs which rely on the executable to determine what the process can / can't do will function correctly
    • The program can easily be run under different user accounts.
    This approach is taken by the .NET runtime and by the programs out there which produce executables from SWF (Flash) files for example.
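The identical-hash and command-line points above can be demonstrated in a few lines. The sketch below is my own illustration (it is not from any particular forensic product): two different "scripts" hash differently on disk, but the executable image that actually runs them - the interpreter binary - is one and the same file, hence one and the same hash, for both.

```python
import hashlib
import os
import sys
import tempfile

def sha1_of(path):
    """SHA1 hash of a file's contents, as a forensic scan would compute it."""
    with open(path, "rb") as f:
        return hashlib.sha1(f.read()).hexdigest()

# Create two different script files standing in for two applications.
with tempfile.TemporaryDirectory() as tmp:
    script_a = os.path.join(tmp, "app_a.py")
    script_b = os.path.join(tmp, "app_b.py")
    with open(script_a, "w") as f:
        f.write("print('I am application A')\n")
    with open(script_b, "w") as f:
        f.write("print('I am application B')\n")

    # The scripts themselves are distinguishable on disk...
    assert sha1_of(script_a) != sha1_of(script_b)

    # ...but the process image a firewall or forensic tool sees (the
    # interpreter) is the same file for both, so it has the same hash.
    interpreter_hash_for_a = sha1_of(sys.executable)
    interpreter_hash_for_b = sha1_of(sys.executable)
    assert interpreter_hash_for_a == interpreter_hash_for_b

    # What does let you tell instances apart (method two above) is the
    # process command line, which contains the script path, e.g.
    # "python /tmp/.../app_a.py".
    print("interpreter image:", sys.executable)
    print("same image hash for both scripts:",
          interpreter_hash_for_a == interpreter_hash_for_b)
```

This is exactly why an executable-keyed firewall rule for the interpreter grants (or denies) network access to every script it runs, while the fourth approach - one native executable per program - gives such products something unique to key on.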