Using token_get_all() and better source highlighting

I recently discovered a very useful PHP function called token_get_all(), which lets you tap into the tokenizer the Zend Engine itself uses when it parses PHP (it's written in C, so it's very fast). The function accepts a string containing PHP code and returns the tokenized output as an array. Each element of that array is either a single-character string (such as =, ;, or even ") or an array containing three values: the token type, represented as an integer; the token text itself (a T_COMMENT token would contain the actual comment); and the line number the token started on. Hint: you can get the “nice” token name by calling the token_name() function on the token type.
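To make that array shape concrete, here's a minimal sketch that walks the output of token_get_all() and normalizes both kinds of element into one row format. The helper name describe_tokens() and the 'CHAR' label for single-character tokens are my own inventions for illustration, and the destructuring syntax assumes PHP 7.1 or newer.

```php
<?php
// Hypothetical helper (not part of PHP): turns token_get_all() output
// into uniform [name, text, line] rows.
function describe_tokens(string $source): array
{
    $rows = [];
    foreach (token_get_all($source) as $token) {
        if (is_array($token)) {
            // Array tokens carry the type ID, the text, and the starting line.
            [$id, $text, $line] = $token;
            $rows[] = [token_name($id), $text, $line];
        } else {
            // Single-character tokens such as = or ; arrive as plain strings
            // with no line number. 'CHAR' is a made-up label for them.
            $rows[] = ['CHAR', $token, null];
        }
    }
    return $rows;
}

// Print one line per token: the token name, then the text as a JSON string
// so that whitespace tokens stay visible.
foreach (describe_tokens('<?php $answer = 42; // done') as [$name, $text, $line]) {
    printf("%-12s %s\n", $name, json_encode($text));
}
```

Since every character of the input ends up in exactly one token, gluing the token texts back together reproduces the original source, which is what makes this function such a good base for highlighting.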

This function allows you to do a lot of different things, such as building powerful debugging tools, writing source code coverage tools, or building an awesome syntax highlighting library like I did. Most libraries just use some basic regex, and you end up with OK syntax highlighting, but nothing special. Using the library I built, you can easily get syntax highlighting that rivals editors like Sublime Text 2 or Eclipse. I’ve put it on GitHub, so go check it out!

Or check out this example of it in action.
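If you just want a feel for how token-based highlighting works, here's a rough sketch of the idea. To be clear, this is not the code from my library: the function name highlight_tokens(), the TOKEN_CLASSES map, and the CSS class names are all made up for this example, and it only covers a handful of token types. It assumes PHP 7.1 or newer.

```php
<?php
// Hypothetical map from token types to CSS class names (illustration only).
const TOKEN_CLASSES = [
    T_COMMENT                  => 'comment',
    T_DOC_COMMENT              => 'comment',
    T_CONSTANT_ENCAPSED_STRING => 'string',
    T_VARIABLE                 => 'variable',
    T_LNUMBER                  => 'number',
    T_DNUMBER                  => 'number',
    T_FUNCTION                 => 'keyword',
    T_RETURN                   => 'keyword',
    T_IF                       => 'keyword',
    T_ELSE                     => 'keyword',
    T_ECHO                     => 'keyword',
];

// Hypothetical helper: wraps recognized tokens in <span> tags and
// HTML-escapes everything so the result is safe to drop into a <pre>.
function highlight_tokens(string $source): string
{
    $html = '';
    foreach (token_get_all($source) as $token) {
        // Single-character tokens are plain strings with no type ID.
        [$id, $text] = is_array($token) ? $token : [null, $token];
        $escaped = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
        $class = $id === null ? null : (TOKEN_CLASSES[$id] ?? null);
        $html .= $class === null
            ? $escaped
            : '<span class="' . $class . '">' . $escaped . '</span>';
    }
    return $html;
}

echo '<pre>' . highlight_tokens('<?php $x = 1; // hi') . "</pre>\n";
```

The win over regex is that the tokenizer already knows a // inside a string isn't a comment and a $ inside a comment isn't a variable, so the highlighter never has to guess.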

My next mini-project will be writing this as a WordPress plugin 🙂

3 Replies to “Using token_get_all() and better source highlighting”

  1. Keith Washburn says:

    Hmm, it looks like your site ate my first comment (it was extremely long), so I guess I’ll just sum up what I submitted and say I’m thoroughly enjoying your blog.
    I’m an aspiring blog writer as well, but I’m still new to the whole thing. Do you have any recommendations for inexperienced blog writers? I’d really appreciate it.

  2. Celesta Robbins says:

    cheers for the advice

  3. Doris Hardesty says:

    Good information. Lucky me I found your blog by accident (stumbleupon).
    I have bookmarked it for later!
