chapter31

development in a land far far away…

at the moment

History is not what happened. History is what was written down.

Yesterday I posted on a handy MySQL feature where you can combine an INSERT and UPDATE on a record in the one statement.

Someone asked about the performance of this so I ran a little test to get some basic numbers.

One thing to note is that even when MySQL is actually UPDATING your record (ON DUPLICATE KEY UPDATE), the statement is being executed on the database as an INSERT and not an UPDATE statement. MySQL Administrator provides a good view of these types of database actions. Not sure if this means anything, but it might be something to watch out for.

I ran a (not so clinical) test by looping over 1000 INSERTs with ON DUPLICATE KEY UPDATE, and then simply using an UPDATE (again 1000 times). I used getTickCount() to provide measurements, restarting MySQL after each type of test.

This ran on CF8, MySQL5 and Ubuntu.

INSERT ON DUPLICATE KEY UPDATE

448
426
356
320
404
387
427
355
433
350
Avg: 391

UPDATE

353
328
356
299
363
342
434
302
342
416
Avg: 354

So although my test wasn’t done under the most optimal conditions, we can see that at first using a simple UPDATE is the quicker of the two.

But…

We still need to know whether we’ll be performing an INSERT or UPDATE. There are of course different ways to tackle this solution, let’s assume that we’re first running a SELECT statement against the database to see if the record exists. The SELECT retrieves 1 column, the primary key value, and of course the WHERE clause uses the primary key as a condition which is a clustered index so should perform quite quickly.

If no record is returned then we’ll be INSERTING, if we get a record then we’ll be UPDATING. Bear in mind that I really dislike this approach, but a lot of people use it so I thought it would be interesting to see the performance of running 2 queries against the database. 1 SELECT, then either an UPDATE or INSERT.

Initial SELECT, then UPDATE

535
487
605
510
534
559
480
515
560
502
Avg: 529

So the cost of running 2 queries makes everything a lot slower than either of the first 2 options.

I’d say using INSERT with ON DUPLICATE KEY UPDATE is a good solution for most situations. But if you really want to get the most out of your application then use an UPDATE but have a separate way to tell if your record exists or not (an empty primary key property in a bean etc).

A little while ago I mentioned combining multiple INSERT statements with MySQL, here is another helpful approach to a common problem.

In many systems you will give the user a chance to ADD and EDIT a record, a News post for example. As usual there are many ways to handle writing any changes to the database, one common approach would be to have 2 methods in a News.cfc

E.g. create() and update()

create() would have an INSERT statement and update() would have an UPDATE statement.

That works well but means you have to have 2 methods for each type of content you want to save to the database.

How about this instead:

From the docs:

If you specify ON DUPLICATE KEY UPDATE, and a row is inserted that would cause a duplicate value in a UNIQUE index or PRIMARY KEY, an UPDATE of the old row is performed.

Now you can just have this query in one method (e.g. save()) which will handle both use cases.

This feature has been available since MySQL 4.1.0

Apparently the CAPTCHA for Windows Live Mail can be cracked in as little as 60 seconds and Gmail’s CAPTCHA has also been cracked.

Does this mean the beginning of the end for CAPTCHA?

There are some other interesting solutions to spambots out there, it’ll be interesting to see the future landscape of this infestation and the methods used to prevent it.

Read more here

Those that use the sensational varscoper tool from Mike Schierberl know that cfscript is still a little experimental with the current release.

One of the things it can do is return variables that are actually “var” scoped as not var scoped when using cfscript. This seems to happen most often when you have a comment directly above the variable in question.

In the above example Varscoper tells me that “newRow” is not var scoped when clearly it is.

This is usually fine, but if you start to get a lot of these false positives there is a quick little workaround. Place a semi-colon at the end of the last comment which effectively tells varscoper where the statement finishes and fixes the problem.

Note the semi-colon after the 2nd comment. Now my reports are nice and clean :)

When you use WebServices with ColdFusion, the WSDL ’stub’ is cached in CF Administrator (presumably for performance reasons).

This can be a pain when you need to change the WSDL and you don’t have access to CF Admin to flush the service which is the scenario I found myself in today.

Luckily for me I sit near some ColdFusion peeps who sorted me out with a little Java method (courtesy of the ServiceFactory) to flush a WSDL from a ColdFusion template.

Basically all you need to do is:

I also wrapped it up in a little function:

Thanks Marko (and Mark)

After the fact I did some googling and found a great read from Doug Boude on Refreshing Cached ColdFusion Webservices Through the Back Door in case you’re interested.