Archive for November, 2009

Running C/C++ Code as a CGI Script

Wednesday, November 25th, 2009

When I first had the realization that this was not only possible, but really simple, I was very excited. Using C/C++ to create dynamic web applications isn’t new. In fact, C and Perl were originally the primary tools for building dynamic web-based applications. Well, the Internet has been around long enough that even Perl is beginning to become obsolete (Thank God).

C++ Web applications are not going to be a game-changer. If this were true, it would have happened a long time ago. Will I forget about PHP? Of course not! Many (probably most) shared hosting solutions won’t even allow you (with very good reason) to execute arbitrary binary files on their servers. Mine won’t. You also can’t (to my knowledge) do this technique on Windows servers.

I WILL, probably, write a few C++ web applications to run on my laptop, just because I’m a nerd, and for the satisfaction I’d get from just knowing that every time I hit the Home button, I’m seeing web pages dished up by a C++ script… I’ll stop there.

For what kinds of projects do I see this as a replacement for PHP (or Ruby, or Python, or ASP, or JSP, or whatever you use)?

  • You are looking to build a robust web application to be run on your own private web servers.
      • I’m thinking, if Twitter was written in C/C++, we wouldn’t have as many Twitter-outages :) .

  • You are looking to build a high-end distributable web-based software package.
      • For example, a software package, where the codebase sits on a server, and clients (or internal employees), interface with it through a web browser. Now we’re talking APPLICATION, more than just a website.

  • You are looking to build a web application, and have the ability to run arbitrary binary scripts on your server (such as a private server), and runtime speed is crucial.
      • With FastCGI and precompiled binary scripts, well-written C/C++ code will trump compile-on-the-fly approaches of PHP, Perl, Python, etc. (Of course PHP has memcache..)

Enough jibber-jabber. Let’s create a C++ CGI script! First, you’ll need to configure Apache to execute CGI scripts. It’s general practice (but you may not care.. I don’t) to create a single directory, and only allow CGI scripts in that directory to be executed. You’ve probably seen a lot of cgi-bin/ directories on various websites. Let’s say we create a directory at /var/www/cgi. In this directory, we’ll put our CGI scripts. Let’s tell Apache.

You’ll want to edit your httpd.conf file (on Ubuntu, it’s in /etc/apache2). Add this (as root/sudo):

<Directory /var/www/cgi>
    Options +ExecCGI
    AddHandler cgi-script .cgi
</Directory>

The Options +ExecCGI line is the one that allows CGI scripts to be executed. If instead of using a single directory, you opted to make the whole DocumentRoot allow CGI scripts (like I did), you’ll want to make sure the Options aren’t overridden elsewhere in the server conf. Namely, check for something in /etc/apache2/sites-available/default (or wherever else your system may store Apache config). In this file, you may see another <Directory> block for your document root. Add ExecCGI to the Options list.

You can also create arbitrary file extensions for your CGI scripts with the AddHandler directive. Imagine the possibilities.

Now, restart Apache. On Ubuntu:

sudo /etc/init.d/apache2 restart

When Apache comes back up, you should be ready to roll. You may feel like throwing in a test Perl script before we get to the C++, just to make sure things are working as expected. If you aren’t a Perl Monk (most of us aren’t), do this:

which perl

That will tell you where perl is installed (if at all). It’s probably /usr/bin/perl. So then, create this Perl script:

#!/usr/bin/perl

print qq(Content-type: text/html\n\n);
print qq(Hello, world!);

Make sure to chmod that bad boy to at least 755, and hit it in the browser. You should see “Hello, world!”. If not, you probably got one of these:

  • You saw the Perl code
      • That means the CGI script didn’t attempt to execute. Check back over the steps, and make sure you restarted Apache.

  • You got Forbidden
      • You either aren’t allowed to execute CGI scripts, or didn’t get the right permissions.

  • Internal Server Error
      • Perl code is probably messed up. Check out tail /var/log/apache2/error.log for what SHOULD be a more detailed error message.

  • File not found
      • You probably have a typo in the filename or the address bar :) .

Well, hopefully you have that working now, so let’s throw down some C++ code.

I’m not going to teach you C++, so if this code doesn’t make sense, you should look into learning C++ before approaching this technique (obviously).


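#include <iostream>
#include <cstdlib>

using namespace std;

int main() {

	// A CGI response starts with a header block, ended by a blank line.
	cout << "Content-type: text/html\n\n";
	cout << "Hello World (Wide Web)<br />" << endl;

	// getenv() returns NULL when the variable isn't set (for example, when
	// you run the binary from a shell), and printing a NULL char* is undefined.
	const char* addr = getenv("REMOTE_ADDR");
	cout << (addr ? addr : "(REMOTE_ADDR not set)") << endl;

	return 0;
}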

Save this as hello.C, or whatever you want.. and compile the code:

g++ -o hello.cgi hello.C

Now, make sure that hello.cgi is in /var/www/cgi (or wherever you specified), and hit it in the web browser. You should see an output something like:

Hello World (Wide Web)
127.0.1.1

One of the biggest pitfalls I can foresee is that server-side scripting is not an interactive technique. That’s why scripting languages are perfect for dynamic web pages. C++, not being a scripting language by nature, may cause you some headaches. Just be sure to write smart, efficient code.

You can also download a C++ CGI library, to help out with accessing header data, such as Cookies, GET and POST variables, etc. Here is a link to an ANSI C library for CGI Programming.
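If you’d rather not pull in a library just to read GET variables, here is a minimal sketch of doing it by hand. It relies on the web server putting everything after the ? in the URL into the QUERY_STRING environment variable, which is standard CGI behavior. The function names (url_decode, parse_query, get_params) are my own, not from any library, and this only covers GET. Cookies and POST bodies would need more work.

```cpp
#include <cctype>
#include <cstdlib>
#include <map>
#include <sstream>
#include <string>

// Decode '+' (space) and %XX escapes in one URL-encoded component.
// Malformed escapes are copied through unchanged.
std::string url_decode(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '+') {
            out += ' ';
        } else if (in[i] == '%' && i + 2 < in.size() &&
                   std::isxdigit(static_cast<unsigned char>(in[i + 1])) &&
                   std::isxdigit(static_cast<unsigned char>(in[i + 2]))) {
            out += static_cast<char>(std::stoi(in.substr(i + 1, 2), nullptr, 16));
            i += 2;  // skip the two hex digits we just consumed
        } else {
            out += in[i];
        }
    }
    return out;
}

// Split "a=1&b=2" into key/value pairs. A key without '=' maps to "".
std::map<std::string, std::string> parse_query(const std::string& query) {
    std::map<std::string, std::string> params;
    std::istringstream stream(query);
    std::string pair;
    while (std::getline(stream, pair, '&')) {
        if (pair.empty()) continue;
        std::size_t eq = pair.find('=');
        std::string key = url_decode(pair.substr(0, eq));
        std::string value;
        if (eq != std::string::npos) value = url_decode(pair.substr(eq + 1));
        params[key] = value;
    }
    return params;
}

// Read the GET variables of the current request.
// getenv() returns NULL outside of a CGI environment, so guard against it.
std::map<std::string, std::string> get_params() {
    const char* qs = std::getenv("QUERY_STRING");
    return parse_query(qs ? qs : "");
}
```

From main() in a CGI program, you would call get_params() after printing the Content-type header and look up the keys you care about. Remember to HTML-escape anything you echo back to the browser.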

Protecting Your Web Application’s Code

Wednesday, November 25th, 2009

If you have ever considered commercial web application development, you’ve probably faced the challenge of protecting your intellectual property. I’ve spent the past few months researching and pondering this very problem.

There are many possibilities and issues here. Dynamic web applications are generally written in a server-side scripting language. The nature of these scripting languages is to compile on-the-fly. This means you store the code in plain sight. That’s not very good when you’re trying to sell software, because anyone who purchases it has the ability to reverse engineer your product.

Some technologies, such as JSP, allow you to compile the code down to bytecode; however, by nature of its design, Java bytecode is compact and simple to reverse engineer. Encryption techniques such as ZendGuard are crackable (decryption has to happen somewhere). ActionScript (Flash) is promising, since it compiles down to binary SWF files; however, tools exist to convert these SWF files back into their FLA counterparts.

There may be no guaranteed, fool-proof way to protect your code, but one thing that has obviously worked well for stand-alone software vendors is binary compilation. Great, so how do I compile my web code to binary? Simple! Just write your web applications in C/C++ (or other language that compiles to binaries), and run them as CGI scripts.

I had a major AH-HA moment when I realized that all a CGI script needs to do is print out the content MIME type and the actual content. Apache will take care of the rest. This approach will only work on Unix-based hosts, as Windows does binaries a little differently (suckers). But as the vast majority of web hosts run on Unix, this isn’t a huge deal.

I’m going to create a 2nd post, demonstrating this technique. Look for it in the not-so-distant future.

Update: The 2nd post is up! Running C/C++ Code as a CGI Script

Bought a New Book about Website Optimization

Tuesday, November 24th, 2009

Stopped by Borders yesterday, and picked up a new book, cleverly titled Website Optimization, by Andrew B. King.

Website Optimization, by Andrew B. King

I was looking for a book which covered SEO, and this fits the bill. O’Reilly books are pretty well known for their quality in the techie field. Paraphrased from the back cover, here is what it has inside:

  • “Best practices to improve search engine rankings”
  • “Keyword optimization and guerilla PR techniques”
  • “Optimize pay-per-click campaigns”
  • “Maximize conversion rates by using landing page guidelines to increase leads and sales…”
  • “Tune website performance by utilizing XHTML, CSS, and Ajax techniques…”

Glancing through, it looks pretty solid, and has a 5-star rating on Amazon. It’s definitely worth checking it out, and the current price of $26.39 (PLUS FREE SHIPPING!) is a much better deal than the $39.99 sticker price I paid.

I have a lot of time off work coming up soon, due to the holidays, vacation, etc. I hope to get through the book soon, and will definitely post my reaction.

Worth Blogging About

Monday, November 23rd, 2009

Well, I thought I’d take a moment to bring up some news. Saturday, I officially made my first move towards becoming a successful affiliate marketer, with my first day of profit. It was only about $20 worth of profit, but, I hope to see that number grow, especially as the holiday shopping season kicks into full gear.

Perhaps, as I see more success, I’ll make more posts about my affiliate marketing ventures. Yeah, one of those guys.

PHP Bitwise Operators and Access Control Lists

Friday, November 20th, 2009

Bitwise operators are a very handy tool in PHP.  The problem is they just aren’t used very often.  I really like bitwise operators, but like other PHP developers, I rarely reach for them.  I think the cause of this is that developers just don’t realize when they COULD be using bitwise operators.

A very good example of where you should be using bitwise operators is in Access Control Lists (ACLs).  An ACL is basically a list of who has access to what.  Well, a simple way to introduce you to access control is via Unix permissions.  Most of you are probably familiar with the chmod command.  Most of you also probably don’t have the codes memorized.  Here is a great little PHP script that will:

A: Help you understand bitwise operators and how you might use them, and
B: Help you understand and memorize the chmod codes.

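// Each permission gets its own bit, so the values are powers of 2.
define("EXECUTE", 1);
define("WRITE", 2);
define("READ", 4);

// Walk through every possible permission digit (0 through 7).
for ($i = 0; $i <= 7; $i++) {
    // A bitwise AND is non-zero only when that permission's bit is set.
    $x = ($i & EXECUTE) ? "x" : "-";
    $w = ($i & WRITE) ? "w" : "-";
    $r = ($i & READ) ? "r" : "-";

    echo "{$i} = {$r}{$w}{$x}<br />";
}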

This will output:

0 = ---
1 = --x
2 = -w-
3 = -wx
4 = r--
5 = r-x
6 = rw-
7 = rwx

A little note about chmod, just in case you were wondering: a chmod code is three digits. The first digit is the access code for the file’s owner, the second digit is the access code for the file’s group, and the third digit is the access code for everyone else. So, let’s take a code: 754. What does this mean? Well, let’s use that list we just created to look it up. The first digit, 7, as stated, maps to the file’s owner, so according to our list, the owner has full permissions, rwx. Next, the second digit is a 5, and that maps to the file’s group; the group has r-x access: read and execute, but no write permissions. The final digit, everyone else, is a 4: read-only access.
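The same lookup can be done in code. Here is a minimal sketch in C++ (to match the CGI post above) that takes a whole chmod code and spells out all three digits. Each digit is three bits wide, so shifting right by 6 and 3 bits lines up the owner and group digits before the masks are applied. The function names digit_to_rwx and mode_to_string are made up for this example.

```cpp
#include <string>

// Same values as the PHP constants: one bit per permission.
const int EXECUTE = 1;
const int WRITE = 2;
const int READ = 4;

// Turn one digit (0-7) into its "rwx" form using bitwise AND.
std::string digit_to_rwx(int digit) {
    std::string s;
    s += (digit & READ) ? 'r' : '-';
    s += (digit & WRITE) ? 'w' : '-';
    s += (digit & EXECUTE) ? 'x' : '-';
    return s;
}

// Turn a full chmod code into owner/group/other permissions.
// Pass the code as an octal literal (leading zero), e.g. 0754.
std::string mode_to_string(int mode) {
    return digit_to_rwx((mode >> 6) & 7)   // owner digit
         + digit_to_rwx((mode >> 3) & 7)   // group digit
         + digit_to_rwx(mode & 7);         // everyone else
}
```

Calling mode_to_string(0754) gives rwxr-xr--, which matches the walk-through above: rwx for the owner, r-x for the group, r-- for everyone else.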

Now, how can you apply this to your code? Well, as long as you keep the integer assignments as powers of 2, you can have as many access codes as there are bits in a PHP integer (32 or 64, depending on your platform):

define("MAGIC", 8);

$wizard = READ | WRITE | EXECUTE | MAGIC;

if ($wizard & MAGIC) {
    echo "Wizards can do magic <br />";
}
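Checking a flag is only half of an ACL; you will also want to grant and revoke access. Here is a rough sketch of those operations, again in C++ to match the CGI post. The helper names (grant, revoke, has_all, has_any) are my own invention. OR sets bits, AND with the inverted mask clears them, and comparing the AND result against the mask checks that every requested bit is set.

```cpp
// One bit per permission, same values as the PHP constants.
const unsigned EXECUTE = 1;
const unsigned WRITE = 2;
const unsigned READ = 4;
const unsigned MAGIC = 8;

// Turn on the given permission bits.
unsigned grant(unsigned perms, unsigned flags) {
    return perms | flags;
}

// Turn off the given permission bits, leaving the others untouched.
unsigned revoke(unsigned perms, unsigned flags) {
    return perms & ~flags;
}

// True only when every bit in `required` is set.
bool has_all(unsigned perms, unsigned required) {
    return (perms & required) == required;
}

// True when at least one bit in `flags` is set.
// This is the check that a plain `if ($perms & FLAG)` performs.
bool has_any(unsigned perms, unsigned flags) {
    return (perms & flags) != 0;
}
```

The has_all / has_any distinction matters once you test more than one flag at a time: a user with only READ passes has_any(perms, READ | WRITE) but fails has_all(perms, READ | WRITE).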

One Month Milestone

Wednesday, November 18th, 2009

It’s now been over one month since I started this blog.  Just thought I’d point that out.  Have I made any valuable posts yet? I’d like to think so, but that’s up to you to decide.  Aside from my blog, I’ve been working diligently on a few side projects, and some extra pages for my SeanJordan.me domain.

I’m the kind of person who always has 4,000 “entrepreneur” ideas going on in the back of my mind.  As I start unrolling some of these, I’ll be sure to let you in on them.  I have a good one coming up that I’d like to share, so be sure to stay tuned.  I’ll share as much as I can on my process of packaging up the finished product, how I market it, and any success I have.

It’s been one whole month, and I’m just now getting to 10 posts.  I’d hoped I would do better, and I’ll try to in the future!

Ubuntu 9.10 – Karmic Koala Reviewed

Saturday, November 14th, 2009

Released at the end of October, “Karmic Koala” is the latest release of Canonical’s hugely popular user-friendly Ubuntu flavor of Linux.  I’ve been an avid Ubuntu user since 6.06.  I’ve been a huge fan.  It’s Linux, so I have the power of the command-line.  I like to do things via command line when possible.  I guess it just makes me feel like I’m more in control.

Last night I finally got around to getting 9.10 installed.  Since I just started to learn Flex, I decided to go with the 32-bit OS, since Flash’s 64-bit Linux support isn’t that great.  Flash 10 brought Adobe Flash Player to 64-bit Linux, but it’s still just in alpha.  I also used the alternate “text” installer.  This is the version to use if you want to get the install completed quicker, or set up an LVM.  Also, thanks to my good buddy, Wikipedia, I learned that a LOT of users are experiencing problems when upgrading from 9.04.  Good thing I chose to do a clean install.  Typically, I won’t upgrade until the new release has been out for a while, but I wanted to do a review, so I decided to install from scratch.  I’m glad I made that decision.  I wasn’t thrilled with 9.04, however, so I couldn’t wait to try out 9.10.

Years ago, Mark Shuttleworth proclaimed that “Pretty is a feature”.  Understanding the trend of the “wow” effect to impress users, Ubuntu comes “pretty” out of the box… unlike most Linux installs.  Minus the added files and links on the taskbar, here is how mine looked, out-of-the-box:

Ubuntu default background and theme

After playing around for a while, I got things comfortable.  Got to have those transparent terminals, people!

My Ubuntu 9.10 Karmic Koala Desktop

My favorite new feature is Ubuntu One.  In a nutshell, it automagically syncs the contents of my ~/Ubuntu One folder to Canonical’s cloud servers.  For FREE, users are given a 2 GB limit (and more can be purchased via monthly subscriptions).  I can access my files anywhere via web browser by logging into the interface.  I’m still working on learning exactly how it works, but I picture using this feature almost daily.

I’m working on figuring out how to sync my /var/www (Apache DocumentRoot) directory, so I can access my web projects from anywhere in real time.  I could have set this up myself using one of my own domains, but this is way cooler.  Once I’ve learned more about the system, I’ll probably dedicate a post to it.  This seems like the type of service that can set Ubuntu apart from the crowd.

It seems that the sync doesn’t follow symlinks, so I created a cronjob to locally rsync the contents of my /var/www folder to my /home/sean/Ubuntu\ One folder.  One of my complaints is the use of a directory name that requires escaping.  This is one of my pet peeves, but for now it is the only folder that can be synced.  I read in the plans that allowing the syncing of other folders would be good.. it even used /var/www as an example folder that people would like synced.

On a less exciting note, the kernel version I’m using (Linux version 2.6.31-14-generic (buildd@rothera) (gcc version 4.4.1 (Ubuntu 4.4.1-4ubuntu8) ) #48-Ubuntu SMP Fri Oct 16 14:04:26 UTC 2009) seems to have brought back the troubles of Sleep/Hibernate in Linux.  Every time I wake my laptop up (HP G60), I get one of these pretty icons in the top right of my screen:

Crash Report Detected

How exciting!  So I figure, let’s see what’s up.. I click on the icon, and that’s when I discover it.  My least favorite new feature.  A long-time staple in Redmond’s operating systems: the worthless error message:

Helpful Error Message

I would at least like to know what process triggered the error!! But, I get it, Ubuntu is targeting users new to Linux, and providing them with that kind of information might be a bad idea.

I decided to report the error, just to see the process.. which eventually told me that it was indeed related to the sleep functionality. Here is the error from my syslog:

Nov 14 17:55:37 ubuntu kernel: [12229.569342] WARNING: at /build/buildd/linux-2.6.31/kernel/power/suspend_test.c:52 suspend_test_finish+0x80/0x90()

As far as load times, etc, I don’t really have much to report.  The start-up time seems to have improved greatly, which is good.

Overall, I’m pretty happy with the latest Ubuntu.  Most things just worked after the install.  I had to install the nonfree nvidia driver, not a big deal.  On my old laptop, I was still using 8.04, which has been my favorite version.  It also happens to be the most recent LTS (Long Term Support) release.  Luckily, the next release, 10.04 (Lucid Lynx), scheduled for 04/2010, will be the next LTS release.

Latest reading..

Monday, November 9th, 2009

Flex for PHP Developers

I’m learning to work with Flex for my latest software package.  The possibility for integration with PHP makes me happy.  Ugh, I’m a nerd. I love code.  Be on the lookout for awesome applications in the future, I’m about to get dangerous.

Prepared statements are not just for security

Thursday, November 5th, 2009

Prepared SQL statements are supported by a lot of database abstraction drivers. Prepared statements are great. If you aren’t using prepared statements, you should seriously look into it!

Values bound as parameters to a prepared statement are immune to SQL injection attacks. That’s right, immune. When you bind your inputs as parameters, you don’t have to worry about properly escaping them; it is handled for you.

Aside from the security, prepared statements shine when you want to perform a query multiple times with different parameters.  The structure of prepared statements is intended just for that.  In fact, it’s the idea behind prepared statements.  Prepare the statement, then run it several times with new sets of parameters.

The downside of prepared statements is the execution speed. Before the query can be executed, it must be “prepared” and its parameters must be bound. In a high-load setting, the increased execution time might be noticeable, but for average instances, it’s negligible.

When digging through old, crappy code, it is pretty commonplace to see developers incorrectly using prepared statements. How is this possible? Is it a vulnerability issue? Well, one of the main ideas behind a prepared statement is that the statement may need to be executed several times, but only needs to be “prepared” once. If you have a loop which executes a query, prepare the statement before entering the loop. Inside the loop, you bind the parameters and execute. Don’t prepare the statement inside the loop.

Preparing a statement involves parsing the query string, so doing it multiple times puts extra load on precious CPU time. I created a simple MySQL schema and PHP script to test this scenario, to get an idea of the extra execution time resulting from this improper usage of prepared statements. The table I used simply had 3 fields:

id int unsigned not null auto_increment primary key,
hash varchar(255) not null,
hashType tinyint unsigned not null

The simple table and the fact that my laptop has virtually no load meant that the queries ran FAST. 20,000 executions took place in ~3 seconds. I upped the number to 300,000 queries. Preparing the statement 300,000 times resulted in a script execution time of 35 seconds (average over 3 trials), and preparing the statement once and then executing 300,000 times resulted in a script execution time of 33 seconds (average over 3 trials).

2 seconds isn’t significant, no, but this was on a dual-core laptop which saw the load peak at 0.61 during the trials. Imagine this running on your shared hosting database. On mine, I saw a 3 second difference when only running 5,000 queries.