I started putting together a few small PHP scripts to move data around. I’ve been using PEAR::DB a fair bit for this sort of thing, but I figured there wasn’t much reason for it in this case, since I knew these scripts were only going to be talking to specific databases and wouldn’t be ported to anything else. This got me wondering what sort of performance penalty I was paying for using PEAR::DB instead of the database-specific PHP functions, which led to a few simple test scripts against a PostgreSQL database.
The first script uses the PHP PostgreSQL functions directly:
ini_set("display_errors", 1);

$time_start = microtime(true);

$db = pg_connect("host=dbserver dbname=testdb user=test password=testpw");

$runs = 250;
for ($i = 0; $i < $runs; $i++) {
    // The query text was lost when this post was mangled; it selected rows
    // with a date column >= '25-MAR-06', returning 273 rows
    $sql = "SELECT ... WHERE ... >= '25-MAR-06'";
    $rs = pg_query($db, $sql);
    while ($row = pg_fetch_assoc($rs)) {
    }
}

$time_end = microtime(true);
$time = $time_end - $time_start;
print("TIME: {$time}\n");
Nothing too exciting: run the same query 250 times and throw away the results. This query happens to return 273 rows against my test database. I ran the script from the command line 10 times; the fastest time was 3.6526100635529 seconds. Normally I’d include error checks, but here I was only interested in how fast I could get through the data. The second script made use of PEAR::DB:
ini_set("display_errors", 1);

require_once("PEAR.php");
require_once("DB.php");

$time_start = microtime(true);

$db = DB::Connect("pgsql://test:testpw@dbserver/testdb");

$runs = 250;
for ($i = 0; $i < $runs; $i++) {
    // Same query as before (text lost in the original post)
    $sql = "SELECT ... WHERE ... >= '25-MAR-06'";
    $rs = $db->query($sql);
    while ($row = $rs->fetchRow()) {
    }
}

$time_end = microtime(true);
$time = $time_end - $time_start;
print("TIME: {$time}\n");
Same conditions: run 10 times from the command line, with the same 273 rows returned from the query. The fastest time was 6.5583028793335 seconds. So PEAR::DB added an extra 2.9056928 seconds to the process, an increase of almost 80%.
Just to add more data to the mix, I put together another script using ADOdb:
ini_set("display_errors", 1);

require_once("./adodb/adodb.inc.php");

$time_start = microtime(true);

$ADODB_FETCH_MODE = ADODB_FETCH_ASSOC;
$db = NewADOConnection("postgres://test:testpw@dbserver/testdb");

$runs = 250;
for ($i = 0; $i < $runs; $i++) {
    // Same query as before (text lost in the original post)
    $sql = "SELECT ... WHERE ... >= '25-MAR-06'";
    $rs = $db->Execute($sql);
    while (!$rs->EOF) {
        $rs->MoveNext();
    }
}

$time_end = microtime(true);
$time = $time_end - $time_start;
print("TIME: {$time}\n");
Once again, 10 runs from the command line with 273 rows returned from the query. The fastest time was 5.3411331176758 seconds. That’s faster than PEAR::DB by more than a second, but still 1.68852305412 seconds behind the script that used the direct database functions: about 46% slower, compared to PEAR::DB’s almost 80%.
I should note that I tried to include PEAR::MDB2 in the test but it kept giving me reference errors. I modified my test script to deal with a couple of them, but gave up after more of them showed up.
What does all this mean? Here is what I take from it:
- If performance matters more to you than supporting multiple databases or using PEAR packages that require PEAR::DB (like PEAR::DB_DataObject), then stick with calling the database functions directly. There’s no way to beat that from a performance perspective.
- If you need to support multiple databases and don’t have to worry about other PEAR packages, then go with ADOdb. It is faster than PEAR::DB by a good margin.
- If you are using other PEAR packages that require PEAR::DB, then go with PEAR::DB.
This wasn’t intended to be an exhaustive test; I just wanted to get a feel for the relative speed of these three options. For reference, I used PHP 5.1.2 on a FreeBSD 6 system. If you run a similar comparison, be sure to look at the relative differences between the times, not the times themselves.
UPDATE Tue 4 Apr 2006 @ 4:30pm : Philip Hofstetter pointed out that I should have included the new PDO functions in my test. So here it is:
ini_set("display_errors", 1);

$time_start = microtime(true);

$db = new PDO("pgsql:host=dbserver dbname=testdb user=test password=testpw");

$runs = 250;
for ($i = 0; $i < $runs; $i++) {
    // Same query as before (text lost in the original post)
    $sql = "SELECT ... WHERE ... >= '25-MAR-06'";
    foreach ($db->query($sql) as $row) {
    }
}

$time_end = microtime(true);
$time = $time_end - $time_start;
print("TIME: {$time}\n");
Once again, 10 runs from the command line, with the query returning 273 rows. The fastest time was 4.287976026535 seconds. So PDO comes in at number two, beating ADOdb by more than a second, but still trailing the direct function calls by more than six tenths of a second, adding only 17% overhead. PDO only works with PHP 5, so if you want or need your app to work with PHP 4, don’t even include PDO on your list of options.
UPDATE Tue 11 Apr 2006 @ 12:10am : Deane over at Gadgetopia.com wanted to know how the ADOdb extension fared in this test. So here we are one more time :-)
After installing the ADOdb extension I ran the exact same ADOdb test script 10 times from the command line, with the same query returning 273 rows. The unmodified script returned times very close to the numbers without the extension; the fastest time was 5.3139600753784 seconds, only 0.027173 seconds faster than plain ADOdb. To test the speed of the extension I made one change to the test script, using adodb_movenext($rs);
instead of $rs->MoveNext();
in the while loop:
ini_set("display_errors", 1);

require_once("./adodb/adodb.inc.php");

$time_start = microtime(true);

$ADODB_FETCH_MODE = ADODB_FETCH_ASSOC;
$db = NewADOConnection("postgres://test:testpw@dbserver/testdb");

$runs = 250;
for ($i = 0; $i < $runs; $i++) {
    // Same query as before (text lost in the original post)
    $sql = "SELECT ... WHERE ... >= '25-MAR-06'";
    $rs = $db->Execute($sql);
    while (!$rs->EOF) {
        # $rs->MoveNext();
        adodb_movenext($rs);
    }
}

$time_end = microtime(true);
$time = $time_end - $time_start;
print("TIME: {$time}\n");
Another 10 runs from the command line, and the fastest time was 4.2841749191284 seconds. That is 1.0569582 seconds faster than ADOdb without the extension, and ever so slightly faster than PDO, beating it by 0.0038011 seconds. Calling the direct database functions is still 0.6315648 seconds faster than ADOdb plus the ADOdb extension.
To make things easier here is a table of the results. Remember that the focus is on the relative differences, not the specific times, which will vary with different hardware/OS/etc. All times are given in seconds.
| Library/Function | Time | Absolute Difference |
|---|---|---|
| Direct database functions | 3.6526100635529 | 0 |
| ADOdb + ADOdb extension | 4.2841749191284 | 0.6315648 |
| PDO | 4.287976026535 | 0.6353659 |
| ADOdb (w/o extension) | 5.3411331176758 | 1.688523 |
| PEAR::DB | 6.5583028793335 | 2.9056928 |
Using the specific database functions still wins; I suspect it always will. It’s interesting that PDO didn’t beat out ADOdb plus the ADOdb extension, although the times are so close that it is probably a wash.
29 replies on “PHP Database Functions vs. PEAR::DB vs. ADOdb (and PDO)”
you really should also run that test with an opcode cache like ionCube or Turck MMCache as (at least with adodb) it’s significantly faster.
If I were running these for a website I would, but as I stated in the post I ran all of these tests from the command line. An opcode cache doesn’t do anything for you when running scripts from the command line interface (CLI).
Hi Joseph,
it looks like we both seem to be having our benchmark day today (thanks for your comments on my blog, btw).
What would really interest me here is PDO’s performance. I guess a lot of performance is lost due to both benchmarked libraries being written in PHP. Maybe native PDO is in fact as speedy (or nearly as speedy) as the pure PostgreSQL API.
If you have the opportunity, maybe go ahead and test it with PDO.
Philip
Completely forgot about PDO. I’ll add that one to the post.
yeah i missed where you said the command line, my penalty for skimming. so are you running this as a bash script or just calling the script like php myscript.php? i was not aware that running a script like php run_me.php would bypass any opcode cache you have. i am surprised by that, since an opcode cache is meant to keep php from going through the whole tokenizing/compiling bit over and over again; i don’t think that’s very optimal.
Any links to docs on this would be super helpful, thanks!
Kenrick-
Opcode caches are able to work because they are part of a persistent process, usually Apache with mod_php. Because the Apache process doesn’t exit between requests, the opcode cache is able to keep things around. I haven’t confirmed this, but I suppose it would be possible to use an opcode cache backed by files instead of memory. I’m not sure that you’d see any sort of gain from that, though. I’ll note that the system I did this on uses APC for its opcode cache, but like I said, I don’t think command line apps see any benefit from it.
I ran these scripts using the PHP binary, like: 'php ./time-adodb.php'. There is a whole section on using PHP from the command line in the docs:
Chapter 43. Using PHP from the command line
Yep you’re right about the opcode not working from command line. I contacted Nick from ionCube and he confirmed that it doesn’t work.
This little test you did has become really helpful for me this week. I run some scripts from the command line for some cron jobs, and I know now I could make them faster by just using the php functions instead of adodb. Thanks.
What were these reference issues in MDB2?
There are some benchmarks in the following two threads:
http://marc.theaimsgroup.com/?l=pear-dev&m=108237909630724&w=2
http://sourceforge.net/mailarchive/message.php?msg_id=10269486
Well actually, the results wouldn’t be affected by an opcode cacher: the timing starts *after* the includes for the database layers :-)
I wonder how the tests would be affected if you used PREPARE/EXECUTE (prepare only once, then execute inside the loop) instead of using QUERY inside the loop. My guess is that the native PHP functions would pull away even more, leaving the libraries in the dust.
Also, IIRC ADOdb does “fake” prepares (it just does some string parsing for you) while PEAR does real prepares (can anyone confirm this?). If this is true, PEAR might get a boost relative to ADOdb when using the PREPARE/EXECUTE method.
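The PREPARE/EXECUTE variation suggested above might look something like this with the native PostgreSQL functions. This is only a sketch: the table name `sometable` and column `somedate` are placeholders, since the actual query text was mangled out of the original post.

```php
<?php
// Hypothetical names: "sometable" and "somedate" stand in for the real
// table/column used in the benchmark.
$db = pg_connect("host=dbserver dbname=testdb user=test password=testpw");

$time_start = microtime(true);

// Prepare the statement once, outside the loop...
pg_prepare($db, "bench", 'SELECT * FROM sometable WHERE somedate >= $1');

$runs = 250;
for ($i = 0; $i < $runs; $i++) {
    // ...then execute it repeatedly with a bound parameter.
    $rs = pg_execute($db, "bench", array('25-MAR-06'));
    while ($row = pg_fetch_assoc($rs)) {
    }
}

print("TIME: " . (microtime(true) - $time_start) . "\n");
```

Since the parse/plan work happens only once, the per-iteration cost drops to executing the plan and fetching rows, which is why the native functions would likely pull further ahead of libraries that re-parse the query each time.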
The curious cat wonders what the same benchmarks look like with ADOdbLite (http://adodblite.sourceforge.net) …
I would be very interested in seeing the construction and destruction of connections factored in. Very rarely do I connect, do 300 operations, and disconnect. A connect, do 5-6 operations, disconnect, make-new-object approach might give more interesting numbers. :-) Also, there’s no escaping of variables in the native library calls, so you might want to add that to the native calls vs. using bind variables in ADOdb/PDO/etc.
Cheers,
Brian
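The bind-variable vs. manual-escaping difference Brian mentions could be sketched like this (again with hypothetical table and column names, since the benchmark’s real query was lost):

```php
<?php
// PDO style: bound parameters, no hand-escaping of values into the SQL.
$db = new PDO("pgsql:host=dbserver dbname=testdb user=test password=testpw");
$stmt = $db->prepare("SELECT * FROM sometable WHERE somedate >= :cutoff");
$stmt->execute(array(':cutoff' => '25-MAR-06'));
while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
}

// Native function style: any user-supplied value must be escaped manually
// before being interpolated into the query string.
$conn = pg_connect("host=dbserver dbname=testdb user=test password=testpw");
$cutoff = pg_escape_string('25-MAR-06');
$rs = pg_query($conn, "SELECT * FROM sometable WHERE somedate >= '{$cutoff}'");
while ($row = pg_fetch_assoc($rs)) {
}
```

The benchmark scripts in the post skip escaping entirely because the query has no user input, but in real code that escaping (or binding) work would narrow the gap between the native calls and the abstraction layers a little.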
Read this article please:
http://www.ibm.com/developerworks/library/os-php-dbmistake/index.html
@Habes –
I’m not sure what your point is; that IBM article has nothing to do with comparing the speed of accessing a database from PHP using different methods.
The IBM article does point out some interesting issues, just not the same ones that are mentioned here.
nice read joseph, i’ve actually developed a CRM in php & adodb for mySQL. after running into a problem i jumped on IRC to see if i could get some help. after being insulted for choosing adodb by #php on 3 servers (even being insulted by an eggbot) i ended up figuring it out solo. why is #php so against adodb, even though it clearly performs well considering its ability to work across multiple databases?
@Nerdboy1024 –
Some projects have avoided it because the ADOdb distribution is rather large and would increase the size of their own project. Also, for projects that only want to focus on one database (like MySQL), the overhead of ADOdb doesn’t give them anything extra.
Those would be my guesses.
Hi Joseph,
Thanks for the benchmark. However, it does not take into consideration the real world aspect of library inclusion overhead.
See: http://www.tonylandis.com/articles/accurate-web-application-benchmarking-methodology.htm
@Tony Landis –
True, it does not include the overhead required to import the library code. For those worried about performance that overhead is likely to be very, very small, since they’ll be using some sort of cache like APC.
Your technique also includes many other factors that I wasn’t interested in measuring; my only concern was the relative performance of talking to the database in different ways.
I read through the results of your tests, I didn’t see anything using the ADOdb extension, which speeds things up quite a bit.
Hi Joseph,
I agree that the ADOdb extension would change things. I suspect that PDO and ADOdb performance would be nearly a tie.
I do have to disagree that the overhead to import code is small. As my tests showed, even with APC enabled, ADOdb performed at a bit less than half the speed of PDO. And since many hosting environments offer neither APC nor the ADOdb extension, ADOdb is going to use a great deal more resources in most cases, so it should be a concern where performance is an issue.
Ultimately, my tests were to determine whether it was an advantage or a disadvantage to add a large distribution such as the ADOdb library, and I understand that your tests were targeted differently.
@Tony Landis (21) –
If you suspect that they would be nearly a tie, why leave it out? And if you are worried about performance, why even consider a host where you can’t use something like APC or the ADOdb extension? Usually that means a dedicated server, where you have a lot more control.
Your tests showed ADOdb with APC improving by more than 700% on the low end. A good chunk of that is because of APC, since it doesn’t have to re-compile the PHP scripts every time. Which goes back to my point: using something like APC offsets library size in a big way. Not entirely, but much better than without it.
It seems obvious on the surface that ADOdb is probably not going to be as fast as PDO, I don’t think anyone will find that surprising.
If you were looking to demonstrate performance issues, why didn’t you include benchmarks for the mysql/mysqli functions? Wouldn’t those be even faster than PDO?
Just a note on “performance” in general: it isn’t a substitute for scalability. Improved performance is an important part of being scalable, but no amount of increased performance will make up for a lack of scalability.
Joseph,
I left out ADOdb with the extension because, as I stated, my tests were to determine whether it was an advantage or a disadvantage to add a large distribution such as the ADOdb library. Also, it is my feeling that PDO is more widely adopted by web hosts (and will become even more so) than the ADOdb extension.
As to why I didn’t benchmark the mysql/mysqli functions – I was benchmarking database abstraction libraries. Of course they should perform faster than any abstraction library but that is beside the point if abstraction is required, right?
Are you suggesting that using any of these libraries are less suitable for use in a scalable application?
Found this on google, and I wanted to update the stats a bit:
PDO, 2500 iterations: 2.94587302208
pg_query, 2500 iterations: 2.72805809975
So they’ve improved PDO in the last year or two.
It’s ~8% slower than direct pg_* functions, at least for relatively simple selects, as opposed to the ~16% when this post was originally written.
1. [dumped PDO driver]: 3.225413084 sec
2. [dumped DB driver]: 6.618566036 sec
3. [dumped MDB2 driver]: 4.775002956 sec
4. [dumped mysql driver]: 3.155264854 sec
<?php
// The opening of this script was lost when the comment was posted; the
// function signature is reconstructed from the dump_users($type, $sql, 500)
// call at the bottom, and smart quotes have been converted back to plain quotes.
function dump_users($type, $sql, $num)
{
    $dsn = array(
        'phptype'  => 'mysql',
        'hostspec' => 'localhost',
        'database' => 'mysql',
        'username' => 'root',
        'password' => 'pass'
    );

    if ($type == "DB")
    {
        require_once("DB.php");
        $db = DB::connect($dsn);
        if (PEAR::isError($db)) { die($db->getMessage()); }
        $db->setFetchMode(DB_FETCHMODE_ASSOC);
        for ($i = 1; $i <= $num; $i++)
        {
            $res = $db->query($sql);
            if (PEAR::isError($res)) { die($res->getMessage()); }
            echo "";
            while ($res->fetchInto($row))
            {
                process_row($row);
            }
            echo "";
        }
    }
    elseif ($type == "MDB2")
    {
        require_once("MDB2.php");
        $mdb2 =& MDB2::connect($dsn);
        if (PEAR::isError($mdb2)) { die($mdb2->getMessage()); }
        for ($i = 1; $i <= $num; $i++)
        {
            $res = $mdb2->query($sql);
            if (PEAR::isError($res)) { die($res->getMessage()); }
            echo "";
            while ($row = $res->fetchRow(MDB2_FETCHMODE_ASSOC))
            {
                process_row($row);
            }
            echo "";
        }
    }
    elseif ($type == "PDO")
    {
        $pdo = new PDO("mysql:dbname=".$dsn['database'], $dsn['username'], $dsn['password']); // PDO::FETCH_ASSOC
        for ($i = 1; $i <= $num; $i++)
        {
            echo "";
            foreach ($pdo->query($sql) as $row)
            {
                process_row($row);
            }
            echo "";
        }
    }
    elseif ($type == "mysql")
    {
        mysql_connect($dsn['hostspec'], $dsn['username'], $dsn['password']);
        mysql_select_db($dsn['database']);
        for ($i = 1; $i <= $num; $i++)
        {
            echo "";
            if ($res = mysql_query($sql))
            {
                while ($row = mysql_fetch_assoc($res))
                {
                    process_row($row);
                }
            }
            echo "";
        }
    }
}

function process_row($row)
{
    ob_start();
    echo sizeof($row);
    //print_r();
    ob_end_clean();
}

$types = array("PDO", "DB", "MDB2", "mysql");
$sql = 'SELECT * FROM user';

require($_SERVER['DOCUMENT_ROOT']."/inc/class/class_timer.php");
$timer = Timer::getInstance();
$timer->saveTime('start');
foreach ($types as $type)
{
    dump_users($type, $sql, 500);
    $timer->saveTime('dumped '.$type.' driver');
}
echo $timer->getReport(1);
?>
Well, the question now is: what about dbx and ODBC performance with this benchmark? Surely ODBC will be one of the slowest, but I really don’t know about dbx! I think that since dbx is based on the vendors’ drivers it should be slower than using them directly, but will it be faster than ODBC, DB, and MDB2?
It’s been a long time since I put together that benchmark. At this point I don’t have any plans to add to it.
My own data shows
pgsql TIME: 1.2292530536652
adodb TIME: 3.7784049510956
pear TIME: 6.3551380634308
pdo TIME: 1.2426941394806
from selecting from 1521 rows over 250times from the DB in my laptop.
Hi!
Nice to have a comparison of speeds. However, it would be even nicer with small charts… the query you use deals with around 270 records. It would be VERY interesting to know how the different libraries behave when dealing with 1,000, 10,000, or 100,000. That would let you draw very simple charts and get an idea of how the different systems deal with scale.
Another interesting statistic would be: what is the amount of memory used by these systems?
Thank you for your work anyway :-)
Regards