A quick tip on how boolean expressions are handled in PHP

I can’t say this is specific to PHP, but this is another thing I see in code a lot, that kind of drives me nuts.


if ($some_var == 3) {
   return TRUE;
} else {
   return FALSE;
}

Because == is a comparison operator, the expression itself evaluates to a boolean value. Comparisons like this always evaluate to TRUE or FALSE.

I.e., your code could be simplified to:

    return ($some_var == 3);

It works with more complex examples as well:

    return ( some_fun() && $some_var == 3 || $foo == 'bar');
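
To convince myself the two styles really are interchangeable, here is a minimal sketch. The function names (is_three_verbose, is_three, complex_check) are made up for this example and are not from any library. It also shows that && binds tighter than ||, which matters for the more complex case.

```php
<?php
// Hypothetical helpers for illustration only.
function is_three_verbose($some_var) {
    if ($some_var == 3) {
        return true;
    } else {
        return false;
    }
}

function is_three($some_var) {
    // The comparison already evaluates to a boolean, so return it directly.
    return $some_var == 3;
}

// && binds tighter than ||, so this reads as
// ($ok && $some_var == 3) || $foo == 'bar'.
function complex_check($ok, $some_var, $foo) {
    return $ok && $some_var == 3 || $foo == 'bar';
}

foreach ([3, '3', 4, null] as $value) {
    // Both versions agree for every input.
    var_dump(is_three_verbose($value) === is_three($value));
}
```

Note that == is a loose comparison, so the string '3' counts as equal to 3 in both versions.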

God, it seems like a lot of my posts are just complaining about the way other people write code. You’d think it’s almost as if I write perfect code. I now must take the time to admit that this is not the case. Also, I can’t claim that coders who write in the style I just called out are wrong. I just don’t like it. :)

Welcome to the new host

You may have experienced some down time earlier today.

I’ve moved to a new hosting solution!

It’s not permanently settled, I’ve still got a lot of configuration to do, so service may be intermittent.

Thanks!

Feedback wanted on email regex

I just wrote an email regex. If you don’t know much about regular expressions, don’t use this regular expression on your website. I’m not claiming this regex to be amazing. I just sat down to write one, and I want some feedback.

If anyone can analyze it, and maybe tell me what I am not thinking about, I’d appreciate it. That’s all I want: to know what I am NOT thinking about. My hope is, the insight from others may help me get better with this type of problem. I’m not a regex rookie, I just thought it would be a nice exercise. Gah.

There, now that all of that defending myself (probably for nothing) is out of the way, here it is:

/^[a-z0-9]+[a-z0-9-_.]*\@[a-z0-9]+[a-z0-9-_.]*\.([a-z]{2,4}|museum|travel)$/i

I purposely ignored the top level domains along the lines of XN--MGBERP4A5D4AR, because, well, I don’t want email from people on those domains?
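
For anyone who wants to poke at it, here is a small sketch that runs the pattern against a handful of addresses. The sample addresses and the email_matches helper are my own inventions for this test, not a proper test suite.

```php
<?php
// The pattern exactly as posted above.
$pattern = '/^[a-z0-9]+[a-z0-9-_.]*\@[a-z0-9]+[a-z0-9-_.]*\.([a-z]{2,4}|museum|travel)$/i';

// Hypothetical sample addresses, chosen only to exercise the pattern.
$samples = [
    'user@example.com',               // plain address
    'USER@EXAMPLE.COM',               // the i flag makes it case-insensitive
    'first.last@mail.example.museum', // one of the two long TLDs allowed
    'user+tag@example.com',           // '+' is not in the character class
    '.user@example.com',              // must start with a letter or digit
    'user@example.photography',       // TLD longer than four letters
    'user@example..com',              // consecutive dots in the domain
];

// Hypothetical helper: true when the whole address matches the pattern.
function email_matches($pattern, $address) {
    return preg_match($pattern, $address) === 1;
}

foreach ($samples as $address) {
    printf("%-32s %s\n", $address, email_matches($pattern, $address) ? 'match' : 'no match');
}
```

The first three match, the next three do not, and the last one does match, because the second character class happily swallows the first dot.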

Thanks!

Theorem #1

For MD5 (or any other hashing algorithm that produces fixed-length strings), there exists some 32-character hexadecimal “string” a (32 characters, when used with MD5) and another string b, of unknown length and character set, such that the sum of the concatenation of a and b is identical to a.

There just has to be. It may just be nearly impossible to find.

In much simpler terms, I’d state it like this:

$a = md5($some_string); //give me some 32 character hexadecimal "string"
$b = $message_from_god;
$c = md5($a . $b);

if($a === $c) { //a and c are exactly identical!
   die('The world will now shatter to pieces.');
}

Make sense? I can’t prove it, but I feel like it HAS to be true, for b of unlimited length.
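
Searching the full 32 characters is hopeless, but the idea can at least be played with on a toy scale. The sketch below is built on my own assumptions: it keeps only the first two hexadecimal characters of the MD5 sum (short_md5 and find_suffix are made-up helpers) and tries numeric suffixes for b. It says nothing about whether the full-length version is true.

```php
<?php
// Toy version of the idea: keep only the first $digits hex characters of MD5,
// so the search space is tiny. This is NOT evidence about full 32-character MD5.
function short_md5($input, $digits) {
    return substr(md5($input), 0, $digits);
}

// Look for a suffix $b such that short_md5($a . $b) === $a.
// Returns the suffix as a string, or null if none is found within $max_tries.
function find_suffix($a, $digits, $max_tries) {
    for ($i = 0; $i < $max_tries; $i++) {
        $b = (string) $i;
        if (short_md5($a . $b, $digits) === $a) {
            return $b;
        }
    }
    return null;
}

$digits = 2;                            // 16^2 = 256 possible values
$a = short_md5('some string', $digits); // a 2-character hexadecimal "string"
$b = find_suffix($a, $digits, 100000);

if ($b !== null) {
    echo "short_md5('$a' . '$b') is '$a'\n";
} else {
    echo "No suffix found; try more candidates.\n";
}
```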

Coding Pet-Peeve #1

Variables declared, just for the sake of being returned.

Example:

function foo() {
    $a = 'bar';
    return $a;
}

Try this, instead:

function foo() {
    return 'bar';
}

Simple, but this drives me nuts, and I see it all of the time.

The Black Keys

Just taking a quick moment for a public service announcement.

A few months back, Pandora introduced me to a great band. I’ve been listening to them more and more.

If you are unaware of The Black Keys, now would be a good time to educate yourself. Amazon has song samples from all of the albums. Totally worth a few minutes of your time.

That is all.

One hundred pushups – week 3

Wow, so this is pretty much all I post about anymore. I don’t like that, and I don’t want to spam my blog up with posts like this. I also don’t want to look like a quitter :-D .

So, I have to post on here, just to say that I haven’t quit on the pushups. Tomorrow, I’ll be doing week 4/day 2. Just eight more days of pushups, and I’ll be through the six week program! On top of this, I also started doing more for strength. I just started lifting weights (on the days I do pushups), to make sure my arms are getting a real workout. I don’t know if it will hurt my pushup counts, but I’d like to think not.

Anyways, just a “for me” post.

kthxbye

One hundred pushups – week 2

Today was the end of my 2nd week of “One Hundred Pushups”! The second week ends with an “exhaustion test” – to see how many pushups you can do before your arms physically cannot do more. Proud to say that, after just two weeks, my number went up by 10 whole pushups! :)

Pumped and ready for week 3!

Modulo Bias

Programmers are often faced with the problem of picking a random number within a given range. The most common way to solve this is to take a random integer and reduce it modulo some smaller number.

For example, assume the function rand() returns a random integer between 0 and an arbitrary maximum value, m. For some number n less than m, to compute a random number between 1 and n we do the following:

r = rand(0,m) % n + 1

This is fine for most cases. However, it CAN introduce what is called a modulo bias. A modulo bias can be visualized quite clearly with a little help from the pigeonhole principle, which states:

Assume you are given n number of pigeonholes, and p > n number of pigeons. If you put each pigeon into a pigeonhole, there must exist at least one pigeonhole which has at least two pigeons.

Pigeonhole Principle

In other words, because there are more pigeons than there are pigeonholes, at least one of the pigeonholes must have at least two pigeons in it! Here we are compressing 10 pigeons into 9 pigeonholes.

This is an example of compressing a larger range (number of pigeons) into a smaller range (number of pigeonholes).

So, how does this come into play when dealing with a random number in a range? We are doing the same thing when picking a random number in a range. We are given the larger range, (0,m), and are trying to compress this down into a smaller range, (1,n).

Assume each of our pigeons is wearing a collar with a number 1-10 written on it. Now assume we have a 10-sided fair die numbered 1-10 on the sides, which we are going to roll. We are going to select a pigeon at random by rolling the die. The pigeon with a collar numbered the same as the up-facing side of the die is the one we will select. Now, the pigeonhole which houses the winning pigeon is going to have food placed into it. Which pigeonhole is most likely to get the food?

Each pigeon has a 10% chance of being selected, so each pigeonhole has a (10% * number of pigeons in pigeonhole) chance of being selected. That is, the pigeonhole with two pigeons has a 20% chance of being selected, where the other eight only have a 10% chance of being selected. There is a bias towards the pigeonhole with the most pigeons in it.

When each pigeonhole has just one pigeon in it, each pigeonhole shares a common chance of winning the food. That is, each pigeonhole has the same chance of winning, because each pigeonhole has the same number of pigeons in it. If we have 18 pigeons and nine pigeonholes, we can put exactly two pigeons in each pigeonhole. Each pigeon adds a 1/18 chance of winning to the pigeonhole it is housed in. Since each pigeonhole has two pigeons, this means each pigeonhole has a 2/18 = 1/9 chance of winning, and the game is now fair!

The same thing happens with the random numbers. Assume our programming language generates random numbers between 0 and m = 9. Now, assume we are faced with a task where we must select a random number between 1 and 3. That is:

$r = (rand(0,9) % 3) + 1;

This code will give us a random number between 1 and 3, but the problem is that it introduces a modulo bias. Here, we are compressing the range (0,9) down into the smaller range (1,3). There are 10 different values in the larger range (the pigeons), and just 3 different values in the smaller range (the pigeonholes).

First, we know that at least one of our pigeonholes will have at least two pigeons in it. We assign each of our pigeons to the correct pigeonhole by using the modulo operation:

(0 % 3) + 1 = 1
(3 % 3) + 1 = 1
(6 % 3) + 1 = 1
(9 % 3) + 1 = 1
(1 % 3) + 1 = 2
(4 % 3) + 1 = 2
(7 % 3) + 1 = 2
(2 % 3) + 1 = 3
(5 % 3) + 1 = 3
(8 % 3) + 1 = 3

Can you spot the modulo bias? In this example 1 has a greater chance (40%) of being selected than either 2 or 3 (30% each). Here, because we have n close to m, the effect is greater. As m gets much bigger than n the effect becomes less noticeable, but it does still exist (the pigeonholes are populated in order, so in the worst case, pigeon counts differ by one). To eliminate the modulo bias, we have to decrease m until the number of possible values, m + 1 (because the range starts at 0), is evenly divisible by n.

That is, if (m + 1) % n != 0, we have a modulo bias.

Subtracting the leftover values yields a new value for m, m', with no modulo bias:

m' = m - x, where x = (m + 1) % n, so that (m' + 1) % n = 0
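
The table above can be reproduced without any randomness by simply counting. A minimal sketch, where bucket_counts is a name I made up for this post:

```php
<?php
// Count how many of the values 0..$m land in each bucket 1..$n
// when mapped with ($value % $n) + 1.
function bucket_counts($m, $n) {
    $counts = array_fill(1, $n, 0);
    for ($value = 0; $value <= $m; $value++) {
        $counts[($value % $n) + 1]++;
    }
    return $counts;
}

print_r(bucket_counts(9, 3)); // buckets 1, 2, 3 hold 4, 3, 3 values
print_r(bucket_counts(8, 3)); // buckets 1, 2, 3 hold 3, 3, 3 values
```

Ten values into three buckets leaves one bucket with an extra value; nine values split evenly.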
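
Often we cannot change the range of the generator itself. One common workaround, which is not what I did above but gets the same effect, is to throw away any value above the reduced maximum and draw again. A rough sketch, assuming the range starts at 0 and n is no larger than m + 1; largest_unbiased_max and unbiased_pick are hypothetical names, and mt_rand is only a stand-in source:

```php
<?php
// Largest maximum <= $m such that the range 0..maximum holds a multiple of $n values.
// Assumes 1 <= $n <= $m + 1.
function largest_unbiased_max($m, $n) {
    return $m - (($m + 1) % $n);
}

// Hypothetical helper: pick a number in 1..$n from a source that returns 0..$m.
// $source is any callable returning an integer in 0..$m.
function unbiased_pick($m, $n, callable $source) {
    $limit = largest_unbiased_max($m, $n);
    do {
        $value = $source();
    } while ($value > $limit); // discard the leftover values and draw again
    return ($value % $n) + 1;
}

// mt_rand(0, 9) stands in for a generator whose range we cannot change.
$r = unbiased_pick(9, 3, function () {
    return mt_rand(0, 9);
});
echo $r, "\n";
```

With m = 9 and n = 3 the limit is 8, so a 9 is simply rolled again.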

To test this, I wrote a short PHP script that computes a random number as:

$r = (rand(0,9) % 3) + 1;

and created 100,000 random numbers 1-3, and came out with the following results:

Array
(
[1] => 40005
[2] => 30352
[3] => 29643
)

Total of 100000 iterations:
1 = 40.005%
2 = 30.352%
3 = 29.643%

Just as I predicted! Now, if we reduce the number of possible values from 10 down to 9 (for example, by calling rand(0,8)), so that the count is evenly divisible by n, we see the results change to:

Array
(
[1] => 33234
[2] => 33413
[3] => 33353
)

Total of 100000 iterations:
1 = 33.234%
2 = 33.413%
3 = 33.353%

And that’s more like it!
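
If you want to try this yourself, something along these lines should do. It is a reconstruction, not the exact script I ran, and the simulate function is a made-up name. Your counts will differ from run to run.

```php
<?php
// Tally $iterations draws of (rand(0, $m) % $n) + 1.
function simulate($m, $n, $iterations) {
    $counts = array_fill(1, $n, 0);
    for ($i = 0; $i < $iterations; $i++) {
        $counts[(rand(0, $m) % $n) + 1]++;
    }
    return $counts;
}

$iterations = 100000;
foreach ([9, 8] as $m) {
    $counts = simulate($m, 3, $iterations);
    echo "rand(0,$m) % 3 + 1, total of $iterations iterations:\n";
    foreach ($counts as $value => $count) {
        printf("%d = %.3f%%\n", $value, 100 * $count / $iterations);
    }
}
```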

One hundred pushups

So, tonight, I completed my first week of One Hundred Push-ups. Feeling pretty pumped about it, already feeling better.

Basically, it is a six-week program which is supposed to gradually increase your push-up stamina, until you are capable of performing one hundred consecutive push-ups. There is nothing ground-breaking about it. You do a few sets of push-ups, three days a week. The program tells you how many to do in each set, for each day. For example, this week, I did these sets: {10,12,7,7,9+}, {10,12,8,8,12+}, and {11,15,9,9,12+}.

The numbers gradually increase, until the end: week 6. The sets for week six look like this: {45, 55, 35, 30, 55+}, {22, 22, 30, 30, 24, 24, 18, 58}, and the final set {26, 26, 33, 33, 26, 26, 22, 22, 60+}.

Let’s just leave it at saying I’m pretty far off from being able to do week 6!