2012/02/28

debugging like a boss: (My)SQL

(my)sql debugging is a breeze. first, write your application in a way that it uses a wrapper for all SQL queries. like this:
function mysql_query_wrapper($query, $link=NULL, $log='errorlog') {
 $res = ($link===NULL) ? mysql_query($query) : mysql_query($query,$link);
 if ($res===false) {
  $error = 'error in query: '.$query.', MySQL said: ';
  $error .= ($link===NULL) ? mysql_error() : mysql_error($link);
  if (($log=='errorlog')||($log='both')) error_log($error);
  if (($log=='display')||($log='both')) echo htmlspecialchars($error);
 }
 return($res);
}
if you start using it, you'll get useful information regarding which query died with what reason. it's rather simple, actually, and helps a lot. most of the errors will be missing statement parts (i sometimes tend to forget from `table` parts), or wrong where clauses.

also important is that right at the get-go, ie. connecting to your database, you check if the actual connection was successful, like this:
$conn = mysql_pconnect($host, $user, $pass)
 or die('no connex to the DB server');
mysql_select_db($my_db,$conn)
 or die('cannot connect to database '.$my_db);

when you have character encoding issues, always check the following:
  • the character set of the database
  • the character set of the table
  • the character set of the fields in question
  • the current connection character set (set by making a query like "SET CHARACTER SET UTF8" and "SET NAMES UTF8")
  • the encoding of your script file (should be UTF-8 in all cases!)
  • the Content-type and META output regarding the character set of the page you're trying to display; my best bet is to use header("Content-type: text/html; charset=UTF-8");
if there is a mismatch somewhere, you'll know where to look. (NB: mysql converts character sets and encoding on the fly, to whatever your connection character set/encoding is. to avoid performance issues, always use UTF-8 in both the database and your connection and your application.)

slightly trickier to debug is when your queries execute slowly. this may happen with very large data sets, or with joined tables. the first and foremost stop should be your trusty phpMyAdmin web application, as it offers a fantastic feature: profiling. how to use it is simple:
  1. choose the SQL tab on your table.
  2. write or paste the query.
  3. make the query.
  4. below "Showing rows..." and your query, there's a checkbox to the right with the label 'Profiling'. check it.
  5. now all subsequent SQL queries in your PMA session will have profiling information, where you can pinpoint which part of the query executes below par.
also, keep in mind the following:
  • optimize your table/field structure! usually, keep one particular datum in one table, in one field, unless you have performance issues, when you're encouraged to use caching fields -- but then be careful to update all cache fields when the datum changes!
  • if you need to access one particular row in a database, use a unique ID field in that table -- usually a field of an unsigned int or bigint type with auto_increment is the easiest solution, and use a unqiue index on that.
  • use indexes, but use indexes sparingly. if a particular field is often featured in your WHERE clauses, it's generally good practice to create an index or a fulltext index (if partial text matching is used) for that field. however, don't index all fields! if a table is regularly modified (inserted, updated or deleted), remaking the indexes can create a huge resource overhead.
  • if you have a field that can take up only a limited variety of values, use an enum() field type, even for boolean-type fields -- low storage requirement, and saves you a lot of trouble when you'd try to insert invalid values. also, it's good practice to add an index to such fields.
  • if you have a table that is written to as much as read from, consider using a fixed row length structure. when using the MyISAM engine, a table will have fixed row length when none of the fields is a varchar(), varbinary() or the text() and blob() types. sure, it can increase table size tremendously if you aren't careful with field lengths, but the seek/write performance is much better for fixed row length tables. (i regularly use them for user tables, for example.)
  • do not optimize tables on the fly, as it is a huge performance hit. i suggest using a separate admin area function or a cron-timed operation to optimize your tables regularly. and yes, you need to optimize them regularly to: a) keep your table growing and growing with deleted record overhead; b) to keep auto_increment fields sane.

No comments:

Post a Comment