When I started working with PHP years ago, the first thing I did was go online and try to find tutorials. I worked with someone who also did PHP (and was the primary driver for wanting to learn the language), but finding out how to do things on my own fit better into my schedule. Eventually I got to working with databases, and the tutorials I found looked like this:
1 2 |
$sql = "SELECT * FROM ".$tableName." WHERE id=".$_GET['id']; $result = mysql_query($sql); |
This is perfectly valid code. In fact, if you look at a lot of examples today you will see the same thing. There’s a huge flaw in these examples, and one that has given PHP a bad name when it comes to security. It isn’t the fault of the language but of the way that many developers learned how to do things. PHP makes it easy to shoot yourself in the foot and it doesn’t help that developers keep showing new developers how to do it. Get gun, point at foot, pull trigger. That’s essentially what the above example is showing you.
So, what is so wrong with the example code? The problem is that we aren’t validating anything. Do we know what $_GET['id']
actually is? What happens if $_GET['id']
is set to something that isn’t an integer?
1 2 3 4 |
// $_GET['id'] = 'bob' $sql = "SELECT * FROM users WHERE id=".$_GET['id']; // $sql is now equal to "SELECT * FROM users WHERE id=bob" $result = mysql_query($sql); |
When mysql_query()
runs it will return false. Depending on how well your code handles a false return, someone screwing around with the query variables will learn that we didn’t properly make sure that the id
query variable is being checked. At the very least they learned that changing query string parameters does cause the code to change. Depending on the setup of our code, it might even expose what database we’re using. If it’s MySQL, they can get crafty and do something like this:
1 2 3 4 |
// $_GET['id'] = '1 OR 1=1' $sql = "SELECT * FROM users WHERE id=".$_GET['id']; // $sql is now equal to "SELECT * FROM users WHERE id=1 OR 1=1" $result = mysql_query($sql); |
That little OR 1=1
bit will cause the SQL server to return all of the data in the user’s database. Let’s take this a step further, and say that we are authenticating a user. Most of the examples would look like this:
1 2 3 4 5 |
$username = $_POST['username']; // Using MD5 to hash passwords is bad, using ONLY as an example. Look into something like PasswordLib for a modern password solution $password = md5($_POST['password']); $sql = "SELECT * FROM users WHERE username='".$username."' AND password='".$password."'"; $result = mysql_query($sql); |
We’re doing one thing correctly (forcing this data to come through the POST array), but we’re still not really validating that $username
or $password
are actually good valid values. We can still do the following:
1 2 3 4 5 6 |
// $_POST['username'] = "admin' -- " $username = $_POST['username']; $password = md5($_POST['password']); $sql = "SELECT * FROM users WHERE username='".$username."' AND password='".$password."'"; // $sql is now SELECT * FROM users WHERE username='admin' -- ' AND password='oasdnifniuh3h48puahergler' $result = mysql_query($sql); |
We’re in, and as a user where we didn’t even know the password. Why? The --
will comment anything after it, so we dropped the AND password=
portion of the SQL and do the select based on just the username. It will return a single value just like the code will expect, so your code never knows that it was altered.
Both of these attacks are called “SQL Injection Attacks”, and is part of the OWASP Top 10’s #1 attack vector. The sad thing is that these are really easy to combat against. I see all the time on Reddit and Stack Overflow people asking for help and seeing this same sort of code. In 2013 we should have been able to stop this, but with years of old tutorials out there it is an uphill battle.
The reason that it sad to see this as still the #1 attack on OWASP is that it is easy to combat. The first and foremost is to stop using the mysql_*
functions in your code. The module that handles those is old, and it doesn’t support the best way to handle SQL injections. Yes, it has mysql_real_escape_string()
, but that relies on the developer remembering to use it. Better security comes from the developer having to not remember to do it and it just being built in.
Since PHP 5.0, PHP has shipped with something called PDO. PDO is a database abstraction layer that makes it easier to write cross-database code. If you write your app for MySQL, you can more easily port it to something like Postgres or MSSQL by doing nothing more than changing the connection parameter (assuming you haven’t written any engine specific code like using LIMIT
, which MySQL and Postgres support but MSSQL doesn’t).
PDO has another big advantage in that it has support for something called a Prepared Statement. With a prepared statement, the SQL query is parsed by the engine by taking a SQL statement and a list of parameters and putting them together itself instead of relying on a fully built SQL statement. This becomes much safer because the DB engine does all the work on properly quoting the values coming in. The side effect of this is that the worry about the developer being safe is offloaded somewhere else, so by using PDO and prepared statements the developer doesn’t need to worry about if a parameter has been sanitized.
Let’s go back to our first example, with the $_GET parameter. What would this look like with PDO?
1 2 3 4 5 6 7 |
$sql = "SELECT * FROM users WHERE id=:id"; // $pdo is our PDO connection $stmt = $pdo->prepare($sql); $stmt->execute(array( ":id" => $_GET['id'] )); $user = $stmt->fetch(); |
It’s a bit longer, but not so much that that there is a ton of extra work involved. What we did was create an SQL statement using placeholders, which in PDO can either be a ?
or named like our example, which was :id
. We run our SQL through the prepare()
method on our PDO connection, which returns a statement object. This statement object gets executed, and at the same time we pass in the values for our placeholders. We can then get the user by calling the fetch()
method on the statement.
We can rest assured that the SQL code is now safe from injections. That’s it! That is the single most effective way to combat SQL injections.
For older legacy applications switching out to PDO can be a major undertaking. Drupal did this in Drupal 7, where they replaced their underlying database abstraction layer to use PDO. Drupal didn’t use the database specific functions all over the codebase which made it somewhat easier to swap out, but if your application is riddle with calls to mysql_*
or pg_*
functions you will want to replace them all with PDO commands.
Do your part to help clean up the bad examples out there. If you ever do new code, use PDO. If you ever give examples, do it using PDO. Maybe we can make the internet a little better, and help knock SQL injections (and other injection types) off of the top of the OWASP Top 10.
Sign up today to have our latest posts delivered straight to your inbox.