wrangling the code

2012/02/28

debugging like a boss: AS3 (Flash CSx and Flex)

debugging techniques in AS3 are quite similar to PHP's techniques, but there are some key differences:

unless you use one of the debugger players, you're in tough luck -- you'll get no error messages, your script will just break and do nothing. so get one. if you use the debugging player plugin, choose a browser you don't use primarily -- because with debugging enabled, you'll be very surprised by the number of buggy flash movies on the net. and you don't want to have your flash game experience interrupted by constant error messages, do you?
while console output is possible, even in the plugin versions, it's a hassle to get it going. you'd be much better off using the JS console or the alert popup of your browser to output errors (through ExternalInterface calls), or native flash alert windows.
AS3 is very picky, and can die even on the slightest mistake. and not even Flex Builder helps in a lot of instances.
flash movies are not linear in execution, rather they are event-driven. that makes simple rundown debugging impossible, but you can still use the watch point techniques as described with PHP.
sandboxes are very important in AS3; please check adobe's documentation regarding this issue. a lot of broken scripts actually just try to step outside their security sandboxes, and get zapped in exchange.
since it's flash, you can't write a debug log file anywhere. sorry. all you have is the console and the builtin alerts and external JS calls.

some tips:

if you are developing in Flash CSx, use the console logging option, as the IDE will show you that.
if you are developing in Flex, use either mx.controls.Alert.show('your debug msg'); to pop up your own debug messages, or use an ExternalInterface call and either log to the browser's console or use its alert popups.
if you are developing in Flex, always use the swfs in the bin-debug folder while debugging, because a release-exported swf will still throw error messages, but without any useful info regarding files or line numbers. however, once your app is ready, always do a release export, because a) you don't want your files to be twice as big as necessary (yup, lots of debugging information inside!), and b) you don't want to expose your sources to the world...
use the try..catch construct in AS3 as well; sometimes it's the only way to keep your script from breaking down on some minuscule error (all errors, regardless of their severity, stop the flash movie's main thread dead in its tracks).
you can use the /* ... */ commenting-out and moving bit-by-bit technique as i've shown you for PHP here as well; but never forget to recompile your movie after each step.
if you use the standalone player (projector), you must assume the worst security sandbox, and you won't have access to the ExternalInterface methods. be prepared for that.
always remember you're developing in an event-driven environment.

debugging like a boss: PHP

PHP is fairly easy to debug. for starters, there are PHP's own error messages (turn display_errors on! even on a production server! if an error is expected, you should write a handler for it, and if it's unexpected, the sooner you get wind of it the better). also, there is an error log for PHP, if you have access to it. furthermore, if you're working in an IDE like Eclipse PDT or NetBeans, you will probably have access to a step debugger such as Xdebug or Zend Debugger, but that's not that easy to set up and use...

but what happens when there are no error messages, and all you get is a blank screen? or what happens where there is no error message, but something runs broken in your app?

fear not, the solutions are quite easy.

first, there are three major ways how you can show usable debug data:

the screen (the browser):

if you're positive that your app gets to the state that it at least displays something, you can use the HTML output as a debug screen. you can go about it in the following ways:

display debug information as-is, with an echo -- a bad idea, only suitable for putting watch points in your script (see below).
put your debug data in a <pre> tag, using htmlspecialchars() for encoding > and < and these sorts of characters: better, as you don't have to care about putting <br/> elements in your output to make sense.
put your debug data in html remarks (begin with <!-- and end with -->, and take care not to have the value --> as a vanilla output lest you prematurely end the remark): almost the best, as it allows you to debug in-production sites as well without messing up the output for regular users.
there is one caveat with the remark output: chrome. since HTML remarks are only visible in source display mode, and given chrome's frustrating solution to give you source view only with reloading the page, if you choose the remarks method, make sure you use firefox, opera or... gah... even IE.
use console.log() javascript calls in your HTML output to log to your browser's javascript console. not available with vanilla IE.
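
the last three options above can be wrapped into small helpers. this is only a sketch: the function names debug_pre(), debug_remark() and debug_console() are made up for illustration and are not part of PHP.

```php
<?php
// hypothetical helpers, not part of PHP itself

// readable block that is safe to drop into an HTML page
function debug_pre($data) {
 return '<pre>'.htmlspecialchars(var_export($data, true)).'</pre>';
}

// hidden in an HTML remark; any "-->" inside the data is broken up
// so that it cannot close the remark early
function debug_remark($data) {
 $text = str_replace('-->', '-- >', var_export($data, true));
 return '<!-- DEBUG: '.$text.' -->';
}

// browser console; json_encode() yields a valid javascript literal
function debug_console($data) {
 return '<script>console.log('.json_encode($data).');</script>';
}

echo debug_pre(array('id' => 5, 'name' => '<b>joe</b>')), "\n";
echo debug_remark('value with --> inside'), "\n";
echo debug_console(array('step' => 'after login')), "\n";
```

all three return strings instead of echoing, so you decide where in the page the debug data ends up.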

using the screen is the best method if you use virtual watch points -- simple echo statements that output where you are in your script. an example:

// ... where you want to watch something
echo "before starting cycle";
for($i=0;$i<1;--$i) {
 echo "executing iteration {$i}...";
 //... your stuff ...
 echo "iteration {$i} run...";
}
echo "after cycle";

it's the simplest way of keeping track of where you are in your script. (and, by using this, you just detected an infinite loop... whoops! :) ) if you have no error messages (meaning that syntactically and programmatically your code is correct), yet you don't get the expected results, you can pinpoint the location of your problem. and as you can see in this example, you can even insert runtime values in your watch messages so you can see the current state of your system.

the error log

if you have access to your web server's error log, you can log debug messages with error_log($message); to your primary error log. you can even output watch point messages like that! the upside is that you will see both your messages and the PHP interpreter's errors in one continuous stream. the downsides are: a) you may not be able to read your error log, depending on your server/hosting setup; b) a LOT of errors might show up there, making it quite difficult to discern what you really need; c) if you want to output large amounts of debug data, error logging is not the best solution for that.

debug files

since writing files became so simple with file_put_contents(), this can be the easiest way of creating debug data.

writing debug data has its own advantages and disadvantages:

con: native PHP errors don't show up here, so unless you do some error catching by yourself, you'll have to match debug data with displayed/error-logged errors.
pro: your debug file will only contain data you specifically want to debug.
pro: you can put in as much data as you want to.
con: you have to take care of date stamping yourself.
con: if you don't have write permissions where you want to put your debug log files, that is in itself another bug you'll have to catch. before using this method, it's good practice to test if you can write a file where you want to put your debug files. if not, change the permissions or ask your sysadmin what to do about it.
pro: you're not tied to single-line logging like in error_log, and you don't have to prepare your data for HTML-compatible output like with the display methods.
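
the write permission test mentioned above could look something like this. it's a minimal sketch: debug_file_usable() is a made-up helper name, and the temp directory is just an assumed place for the log file.

```php
<?php
// hypothetical helper: returns true if $file can be appended to
function debug_file_usable($file) {
 if (file_exists($file)) {
  return is_writable($file);
 }
 // the file does not exist yet, so the directory must allow creating it
 return is_writable(dirname($file));
}

$file = sys_get_temp_dir().'/my_debug.log'; // assumed location
if (debug_file_usable($file)) {
 // date stamping is up to you
 file_put_contents($file, date('Y-m-d H:i:s').": debug log ok\n", FILE_APPEND);
} else {
 // fall back to the error log, so the permission problem itself is visible
 error_log('cannot write debug file '.$file);
}
```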

if your script is a backend for an AJAX request, especially if it's one that is supposed to yield JSON or XML output (that is, not native HTML/JavaScript), using the error log and debug files might be your only choice.

what should you output for debugging, and how?

this depends on your needs and the application.

if the application is at a production server, ie. many people use it, make sure you don't disclose sensitive information readily readable on the screen, and you must use the HTML comment/remark technique to hide any such things from regular visitors.

you can safely output your virtual watch point messages as long as they don't contain sensitive information and don't confuse the user. however, if you want to display data sets and structures, simple echos won't do the trick for you.

as for how to create debug output:

consider the following code.

define('MY_DEBUG_ACTIVE',1);
define('MY_DEBUG_FILE',dirname(__FILE__).'/debug.log');

function debugmsg($message,$level=1) {
 if ($level<=MY_DEBUG_ACTIVE) {
  $msg = date('Y-m-d H:i:s').': '.$message."\n";
  file_put_contents(MY_DEBUG_FILE,$msg,FILE_APPEND);
 }
}

//... your code... watchpoint comes...
debugmsg("before starting up db");
mysql_connect();
debugmsg("sql started up");
//... etc.

if you use something like this, and put in a lot of debugmsg() statements in your code, you can thoroughly check your code's run. and, if it is no longer necessary to debug, you just change the MY_DEBUG_ACTIVE constant to 0. of course, this is not the end of all things: you can create a system which distinguishes between debug levels (usually three levels should suffice: 0-nothing, 1-general debug, 2-detailed debug), and call debugmsg with either 1 or 2 as the debug level. it is prudent to leave this debugging code in production-level scripts as well, because you may never know when you need to use it again, and the performance overhead is minimal when debugging is turned off.

now, what to put in the debug output?

first, watch point information is good, because you'll get a sense where you are in your script, so if debug output suddenly ceases, you'll know where the script breaks in a major way.

second, when you deal with incoming data either from user input or otherwise, you should debug some or all such data as well. when you want to debug an array or object structure in its entirety, you can create a human-readable form with var_export($data_to_debug,TRUE); do not use var_dump, or print_r without TRUE as its second parameter (as they write right into the output buffer, so unless you do some buffering of your own, you'll likely break the output of your script and your debug log stays empty). besides this, you can always put the variables of your choice in the $message string, but take care to identify which variable's value you are putting there.
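
a short sketch of turning a structure into a log-ready string; the $data array is made-up sample data.

```php
<?php
$data = array('user' => 'joe', 'roles' => array('admin', 'editor'));

// with TRUE as the second parameter, var_export() returns the text
// instead of printing it
$text = var_export($data, true);

// name the variable, so the log entry explains itself
$message = '$data = '.$text;
echo $message, "\n";
```

the resulting $message can be passed to whatever logging function you use.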

if you want to catch exceptions, you have to use PHP's try..catch construct, so error messages don't end up on the display or in the error log, but rather you'll have your own chance to jot them down in your debug output. please see PHP's documentation about this. it's also a safer way to handle errors, as such errors don't just go and ruin your output, but give you the opportunity to handle them gracefully. mind that PHP's exception model gives you the possibility to backtrace the error, yielding the complete call stack (basically the route your application took until it ran into the exception), which is immensely useful when tracing bugs.
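
here is a minimal sketch of catching an exception and keeping its call stack. load_user() is a made-up function that exists only to throw, and the $log array stands in for your debug file.

```php
<?php
// hypothetical function that fails on bad input
function load_user($id) {
 if (!is_int($id)) {
  throw new InvalidArgumentException('user id must be an integer');
 }
 return array('id' => $id);
}

$log = array(); // stands in for your debug file in this sketch
try {
 load_user('abc');
} catch (Exception $e) {
 // message and location of the error
 $log[] = get_class($e).': '.$e->getMessage()
  .' at '.$e->getFile().':'.$e->getLine();
 // the complete call stack as text
 $log[] = $e->getTraceAsString();
}
echo implode("\n", $log), "\n";
```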

lastly, if you want to have your output included in the debug log, you should read about PHP's output buffering solutions, because with that you can catch the normal output of your application and have it logged, and then send it to the client afterward. (however, in timed operations, like an AJAX backend that provides timely output, that solution does not work.)
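
a minimal sketch of that idea, using PHP's ob_start(), ob_get_contents() and ob_end_flush(); the echoed paragraph stands in for your application's real output.

```php
<?php
$captured = '';

ob_start();                     // from here on, output is held back
echo "<p>hello world</p>";      // the normal output of your application
$captured = ob_get_contents();  // a copy of everything echoed so far
ob_end_flush();                 // now send it to the client

// $captured can now go to your debug file or error log
```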

a few more tips regarding PHP debugging:

the worst is when your script dies without output, and not even debug output is done. in that case, use the comment-out technique:

put in a simple echo "hello world"; line as the first actual line of PHP code in your script, and use the /* ... */ comment type to comment out the rest of your script. try to run it, if not even "hello world" appears in your browser, you have a problem with your PHP setup, so go and correct that (or have it corrected by the sysadmin).
function definition by function definition, class by class, go one by one and move the beginning /* comment mark lower and lower in your code, and try running your script in each step. if you don't have functions or classes, move 10 lines down each time. don't mind if you run into (even fatal) errors, that's just a sign of PHP's interpreter working properly.
when, after one of the steps, the script once again dies without any output, move back to the previous point where your script still worked to some extent, and then go line by line.
this way, you can isolate the offensive line that killed even PHP's interpreter! :)

another tip: use an IDE (Eclipse PDT, NetBeans), because they can help catch syntactical errors right in the editing phase, saving you a lot of hassle afterwards -- but never forget, they're good, but they're not omnipotent! sometimes they fail to catch some errors, and you'll have to use common sense and the debugging tools above to identify your problems.

typical errors and tips:

when using the heredoc syntax ($something = <<<EOT .... EOT;), use it with care. here's a good guide how. you must remember that when you put the EOT; part in your script, there should be nothing before EOT on that line, and no whitespace or anything else after the ; and before the new line marker; otherwise, be prepared for very misleading error messages.
when using the == operator to compare things, take care not to write a single = sign, because that's assignment, not a comparison! some sources say the best practice is to use a reverse notation (like 'some string'==$variable) because if you omit one of the two = signs it will fail with an error, but personally, i find that very confusing. :)
make sure you know what assigning by value and by reference are, and you know the distinction between the two; and always remember, objects are always assigned by reference.
be careful how you interpret a result from a function/method call; if the method can return a boolean or NULL value for an error/mismatch, and 0 as a valid result, always use the strict comparison operators (=== and !==) to specifically check for NULL and boolean values.
using undefined indexes and variables throws only an E_NOTICE, but you should watch out for those as well (told you, E_ALL is what you need to write really clean code!). don't just assume they're defined -- check with isset() if they are, before doing anything with them (even comparisons!).
never trust external input! never! always assume there's a hacker or a script kiddie somewhere out there who'd try and break your application. always check incoming variables for validity, and never, EVER put such values unescaped in: SQL queries, system/exec commands, or eval() statements! NEVER! use mysql_real_escape_string(), preg_quote(), addslashes(), escapeshellarg() or escapeshellcmd() to make sure nothing harmful enters your system.
try to avoid eval() as much as possible.
if you've gotten comfortable with ereg, eregi, ereg_replace, eregi_replace... well, tough luck. i have (since i've been programming in PHP for 10+ years), but since its UTF-8 support is clunky, and the whole library is now obsolete, create your new code with the PCRE counterparts, and if you have the time, go back and fix your old code to use them, too. PCRE is faster, more compliant, and what's most important, it won't throw you a zillion E_DEPRECATED messages.
if you are developing a lot of classes, you have the UnitTest framework to test your classes and their methods. i'll cover that later (much later...).
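
three of the tips above (strict comparison, isset() checks, object assignment) in one short sketch; all variable names and values are made up for illustration.

```php
<?php
// strpos() returns 0 for a match at the start, and false for no match
$pos = strpos('hello', 'h');
$loose  = ($pos == false);   // true: 0 is loosely equal to false
$strict = ($pos === false);  // false: 0 is not the boolean false

// check for existence before using an index
$input = array('page' => '2');
$page = isset($input['page']) ? (int)$input['page'] : 1;
$sort = isset($input['sort']) ? $input['sort'] : 'date';

// both variables point at the same object after assignment
$a = new stdClass();
$a->n = 1;
$b = $a;
$b->n = 2;      // $a->n is now 2 as well
$c = clone $a;  // an independent copy
$c->n = 3;      // $a->n stays 2
```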

debugging like a boss: (My)SQL

(my)sql debugging is a breeze. first, write your application in a way that it uses a wrapper for all SQL queries. like this:

function mysql_query_wrapper($query, $link=NULL, $log='errorlog') {
 $res = ($link===NULL) ? mysql_query($query) : mysql_query($query,$link);
 if ($res===false) {
  $error = 'error in query: '.$query.', MySQL said: ';
  $error .= ($link===NULL) ? mysql_error() : mysql_error($link);
  if (($log=='errorlog')||($log=='both')) error_log($error);
  if (($log=='display')||($log=='both')) echo htmlspecialchars($error);
 }
 return($res);
}

if you start using it, you'll get useful information regarding which query died and for what reason. it's rather simple, actually, and helps a lot. most of the errors will be missing statement parts (i sometimes tend to forget the from `table` part), or wrong where clauses.

also important is that right at the get-go, ie. connecting to your database, you check if the actual connection was successful, like this:

$conn = mysql_pconnect($host, $user, $pass)
 or die('no connex to the DB server');
mysql_select_db($my_db,$conn)
 or die('cannot connect to database '.$my_db);

when you have character encoding issues, always check the following:

the character set of the database
the character set of the table
the character set of the fields in question
the current connection character set (set by making a query like "SET CHARACTER SET UTF8" and "SET NAMES UTF8")
the encoding of your script file (should be UTF-8 in all cases!)
the Content-type and META output regarding the character set of the page you're trying to display; my best bet is to use header("Content-type: text/html; charset=UTF-8");

if there is a mismatch somewhere, you'll know where to look. (NB: mysql converts character sets and encoding on the fly, to whatever your connection character set/encoding is. to avoid performance issues, always use UTF-8 in both the database and your connection and your application.)

slightly trickier to debug is when your queries execute slowly. this may happen with very large data sets, or with joined tables. the first and foremost stop should be your trusty phpMyAdmin web application, as it offers a fantastic feature: profiling. how to use it is simple:

choose the SQL tab on your table.
write or paste the query.
run the query.
below "Showing rows..." and your query, there's a checkbox to the right with the label 'Profiling'. check it.
now all subsequent SQL queries in your PMA session will have profiling information, where you can pinpoint which part of the query executes below par.

also, keep in mind the following:

optimize your table/field structure! usually, keep one particular datum in one table, in one field, unless you have performance issues, when you're encouraged to use caching fields -- but then be careful to update all cache fields when the datum changes!
if you need to access one particular row in a database, use a unique ID field in that table -- usually a field of an unsigned int or bigint type with auto_increment is the easiest solution, and use a unique index on that.
use indexes, but use indexes sparingly. if a particular field is often featured in your WHERE clauses, it's generally good practice to create an index or a fulltext index (if partial text matching is used) for that field. however, don't index all fields! if a table is regularly modified (inserted, updated or deleted), remaking the indexes can create a huge resource overhead.
if you have a field that can take up only a limited variety of values, use an enum() field type, even for boolean-type fields -- low storage requirement, and saves you a lot of trouble when you'd try to insert invalid values. also, it's good practice to add an index to such fields.
if you have a table that is written to as much as read from, consider using a fixed row length structure. when using the MyISAM engine, a table will have fixed row length when none of the fields is a varchar(), varbinary() or the text() and blob() types. sure, it can increase table size tremendously if you aren't careful with field lengths, but the seek/write performance is much better for fixed row length tables. (i regularly use them for user tables, for example.)
do not optimize tables on the fly, as it is a huge performance hit. i suggest using a separate admin area function or a cron-timed operation to optimize your tables regularly. and yes, you need to optimize them regularly: a) to keep your table from growing and growing with deleted record overhead; b) to keep auto_increment fields sane.

debugging like a boss: JavaScript, jquery, prototype, AJAX

on decent browsers (vanilla IE is not one of them), you have the script console, which should give you at least some valuable information about the bugs encountered. they also offer console.log() to let you write debug messages to the javascript console. firebug and chrome's developer tools also provide breakpoints and, what's more, watch expressions. i use the latter a lot to inspect the states of JS objects i created, and i suggest you do the same.

the most confusing errors are the ones that seem to originate from jquery or prototype; in this case, you should examine the bug's call stack to see where your script made the framework call that caused the glitch. most of the time it's about nonexistent elements you're referring to, or events which you have an invalid handler for.

another subspecies of errors is AJAX calls. things can get difficult there. first up, you have to write your AJAX call handling in a way that it intercepts network errors or server error responses, so if there's a legitimate reason why your AJAX request failed, you can handle it gracefully.

that done, if your AJAX output is HTML code interspersed with JS, you're in the clear as you can put in HTML tags containing debug information from the PHP source, or even console.log() calls in JS.

however, if you output JSON or XML as your result, you should write your AJAX processing in a way that it can interpret certain properties in JSON or certain nodes in XML that should go to a debug output in your AJAX application; the debug output can be alert() calls (which can be a nuisance and are not fit for displaying large amounts of debug information), element write-ins or console.log() calls. also, using watch expressions, you can see the actual contents of your JSON/XML information.

as for the PHP end of things, i'll describe them in the PHP section.
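
on the server side, the JSON variant could be sketched like this. the MY_AJAX_DEBUG constant, the ajax_response() helper and the "debug" property name are all assumptions made for this example; use whatever your client-side code expects.

```php
<?php
define('MY_AJAX_DEBUG', 1); // hypothetical switch, set to 0 to silence

// hypothetical helper: builds the JSON body of an AJAX response
function ajax_response($result, $debug_lines) {
 $out = array('result' => $result);
 if (MY_AJAX_DEBUG) {
  // the client can pass this property on to console.log()
  $out['debug'] = $debug_lines;
 }
 return json_encode($out);
}

$debug = array();
$debug[] = 'before query';
$rows = array(array('id' => 1)); // stands in for a real query result
$debug[] = 'rows found: '.count($rows);

echo ajax_response($rows, $debug), "\n";
```

the debug messages travel inside the payload, so the JSON stays valid and nothing is echoed outside of it.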

debugging like a boss: CSS

CSS can be a pain in the ass to debug, but thankfully, you have firebug on firefox, and you have the developer tools in chrome. i use chrome, so i use the developer tools, which give you a way to inspect any and all elements on a page, even dynamically created ones, and among the properties you can see the actual CSS inheritance chain used for that particular element. using it makes it a breeze to find and debug CSS bugs, as you can see which definition is in effect.

le bugs, le bugs! debugging like a boss

okay, writing code is sometimes easy, sometimes hard, but it's a walk in the park most of the time, when compared to bug hunting.

PHP and JavaScript are, usually, easy to debug. after all, if display_errors is turned on in your PHP configuration, you get first-hand notices of anything that went wrong (and i suggest setting it so that it displays ALL warnings, notices, deprecated uses and so on -- you need to write code that is flawless and up-to-date, regardless of anything anyone may say). for JavaScript, you have the javascript error console in most vanilla browsers (except for IE, where JS error display sometimes may be... less than informative; but IE is a bastard in many other regards as well).

trouble comes when there is no error display and your application just doesn't work.

of course, if you do development in a fairly complex IDE, you should have debugging tools at your fingertips, but setting them up and using them may sometimes be clunky and confusing. so, instead, let me share my tips for debugging with you.

and for starters: debugging is:

isolating and identifying the point(s) in your scripts where something does not work as expected,
identifying the data causing havoc,
correcting them both.

debugging is the toughest part of any development, but it is a step that cannot be avoided. we must do all in our power to deliver bug-free code that works as expected. period.

in the following articles, i'll show you tips and tricks to make debugging easy.

rule zero: if you use any languages, technologies and extensions, READ THE MANUAL before you start coding away -- you can save yourself tremendous time and effort. especially consult the known bugs/errata/issues part of the documentation, because it is very possible that even if your code and data are bug-free, the system you are using contains inherent faults which you have to work around to get the desired results.

for PHP and AS3, the online documentation is a treasure, because it contains user-written additions that might point out inherent flaws or special behaviour not necessarily found in the official documentation.

now, let the series commence!

the trouble with babylon

developing multi-language apps -- even webapps -- can be a pain in the rectum, to say the least.

when working with PHP, you have a multitude of choices:

0. language-dependent conditionals
pros: none. none. okay, maybe the fact that they are integrated within the code, but that's it.
cons: about everything. for one, bloated and clunky code.
usage scenario: DO NOT EVER DO THIS.

1. array-based dictionaries
pros: native solution, fast for smaller wordsets, easy to modify
cons: must keep track of item IDs, consumes a lot of processing resource above certain wordset sizes, clumsy to use, has to have proprietary implementation, editing tools limited to source code editing, dictionary must be present initially, hard to expand to a new language, new items must be added manually.
usage scenario: PHP apps that work anywhere, quick'n'dirty development.

2/a. database-backend dictionaries (major SQL servers)
pros: easy to manage (you can write any frontend of your liking to edit them), easy to expand to a new language, easy to query.
cons: needs a database backend, needs to have a database and a table set, dictionary must be present initially, has to have a proprietary implementation, db queries can be a resource issue, new items must be added manually.
usage scenario: PHP apps that work in most environments and where you have a database server and where performance is not a big issue; also, apps where new expressions may be added dynamically to the dictionary.

2/b. SQLite-backend dictionary
pros: all of 2/a plus the SQLite db is just a file, so it's easy to move and set up.
cons: needs a PHP installation with SQLite capability, and while not as big a resource hog as a major DB server, with large wordsets it can slow things down; also, most of 2/a.
usage scenario: same as 2/a.

3. gettext native PHP support
pros: the most widely accepted and universal i18n (internationalization)/l10n (localization) toolkit; lets you write mostly native code with an initial language of your choice, and worry about translations later on; you can work with partial translations; supports singular and plural forms; a plethora of ready-made software to let you prep and edit translations, and even lets you outsource translation; very easy to add new expressions.
cons: requires a bit of planning ahead (ie. if you decide to use gettext, you must write code that utilizes the gettext functions), rigid resource file location scheme, a hassle to set up properly (environment variables, charsets, etc); if you use caching, it's lightning-fast, but any change to the dictionaries requires a soft-restart of the webserver hosting the PHP interpreter; if you eschew caching, it's slow and a resource hog. requires you to use poEdit, xgettext or an IDE that supports gettext strings extraction; context-based gettexting is not natively supported (seriously, why, PHP, why?!?!)
usage scenario: apps that have a LOT of strings to localize, apps that have outsourced translators, apps where the dictionary expands a lot by programming.

4. php-gettext software library
pros: almost all of the native PHP gettext implementation, and doesn't need soft restarts of the web server when the dictionary changes. also, less hassle with specifying languages and charsets; supports context-based gettexting.
cons: besides the caching and context problem, almost all of native gettext's hassles. performance can be an issue as it tries to wrangle the binary-form .mo files, and does not cache them. a lot of classes and files to include.
usage scenario: like gettext, but where context-gettexting or the caching-restart hassle is an issue.
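
for comparison with the gettext-based options, the array-based dictionary of option 1 can be sketched in a few lines. the dictionary layout, the item IDs and the t() helper are all made up for illustration; every app ends up with its own proprietary variant of this.

```php
<?php
// hypothetical dictionary: item ID => text, one array per language
$dictionary = array(
 'en' => array('greeting' => 'hello', 'farewell' => 'goodbye'),
 'hu' => array('greeting' => 'szia'),
);

// hypothetical lookup helper: falls back to english, then to the ID itself
function t($dict, $id, $lang) {
 if (isset($dict[$lang][$id])) return $dict[$lang][$id];
 if (isset($dict['en'][$id])) return $dict['en'][$id];
 return $id;
}

echo t($dictionary, 'greeting', 'hu'), "\n";
echo t($dictionary, 'farewell', 'hu'), "\n";
```

the fallback chain shows the "must keep track of item IDs" problem nicely: a missing ID silently ends up on the screen as-is.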

needless to say, i wasn't too happy when i started developing my new framework (called wg5, there will be a lot of articles about it later) -- almost all solutions have a lot of cons that outweigh the pros. but never one to accept defeat and go for an uneasy compromise, i decided it was time for a third alternative to using gettext in php. ladies and gentlemen, please welcome the PDXMLang classes!

it's pure PHP5 OOP, so you can easily extend or override parts of it -- especially those which concern dictionary loading and caching. it's a single library file, and does not depend on any external resources -- except of course the dictionary .po files. also, it's only two classes, and one gets initialized only when there is no cache file/data available. it is a complete implementation of gettext features except for catalog-specific calls -- they are provided for compatibility, but behave exactly like the non-catalog-specific counterparts. namely, it provides the following methods: textdomain(), gettext(), _(), ngettext(), dgettext(), dngettext(), pgettext(), npgettext(), dpgettext(), dnpgettext().

it has its own system of path names, but being an OOP construct, you can easily override that for your purposes.

the main idea is that it interprets the .po files directly (therefore, no hassle with having to interpret binary .mo files), and creates a special hash-array representation of all the items, then caches (to a file -- but it's your choice if you want to cache it into memory with memcache or APC) the result either as a serialize()d text or a json_encode()d text, and in subsequent runs uses the serialized file to initialize the dictionary.
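
to make the parse-then-cache idea concrete, here is a heavily simplified sketch. this is NOT the PDXMLang code: po_parse() and po_load_cached() are made-up names, and the parser only understands single-line msgid/msgstr pairs (no plurals, contexts or multi-line strings).

```php
<?php
// hypothetical, simplified .po reader: msgid => msgstr
function po_parse($po_text) {
 $dict = array();
 $msgid = null;
 foreach (preg_split('/\r\n|\r|\n/', $po_text) as $line) {
  if (preg_match('/^msgid\s+"(.*)"\s*$/', $line, $m)) {
   $msgid = stripcslashes($m[1]);
  } elseif ($msgid !== null && preg_match('/^msgstr\s+"(.*)"\s*$/', $line, $m)) {
   // skip the header entry (empty msgid) and untranslated items
   if ($msgid !== '' && $m[1] !== '') {
    $dict[$msgid] = stripcslashes($m[1]);
   }
   $msgid = null;
  }
 }
 return $dict;
}

// hypothetical loader: uses the serialized cache while it is not older
// than the .po file, otherwise parses the .po file and rewrites the cache
function po_load_cached($po_file, $cache_file) {
 if (is_file($cache_file) && filemtime($cache_file) >= filemtime($po_file)) {
  return unserialize(file_get_contents($cache_file));
 }
 $dict = po_parse(file_get_contents($po_file));
 file_put_contents($cache_file, serialize($dict));
 return $dict;
}
```

the cache check by file time is my own assumption for the sketch; a memcache or APC backend would replace the two file calls.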

it supports plural forms in almost all officially-supported (by gettext) languages, using native PHP code dependent on the language specification. also, it supports minor tweaks of the official implementation.
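
language-dependent plural selection in native PHP might look something like this. again, this is not the library's code, just a sketch: plural_index() is a made-up name, and only three sample languages are covered, following the usual gettext plural rules for them.

```php
<?php
// hypothetical sketch: returns the index of the plural form for count $n
function plural_index($lang, $n) {
 $n = abs((int)$n);
 switch ($lang) {
  case 'fr': // two forms, 0 and 1 are both singular
   return ($n > 1) ? 1 : 0;
  case 'ru': // three forms
   if ($n % 10 == 1 && $n % 100 != 11) return 0;
   if ($n % 10 >= 2 && $n % 10 <= 4 && ($n % 100 < 10 || $n % 100 >= 20)) return 1;
   return 2;
  default: // english-style, two forms, only 1 is singular
   return ($n != 1) ? 1 : 0;
 }
}
```

the returned index picks the matching msgstr[N] entry of a plural item.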

the library also offers on-the-fly charset conversions using either mbstring or iconv. however, the whole thing is aimed at UTF-8, being the de-facto standard for charsets.

all documentation is provided within the source file, phpdoc-style.

i suggest using poEdit for both strings extraction and translation work: it's free, it's multi-platform, and it kicks ass in many ways.

and, to be fair to my library, here's:

5. the subpar daemon's PDXMLang gettext replacement
pros: full gettext implementation without the hassles of the original regarding caching, non-caching, and path layout structure and environment settings; best behaves in UTF-8, which is standard; compact and well-documented code library; very fast after first caching; offers tweaks of behaviour; very extensible due to pure-OOP programming style; charset conversion on-the-fly; fast plural form implementation.
cons: slow initial read-in time for .po files; consumes a bit more memory than other gettext implementations (but not much); plural-form detection is based on language, not .po file spec.
usage scenario: as with other gettext usages, except it's much easier. :)

have fun using it, and drop me a line here if you like it, use it, or have any issues with it.