Random database results with ColdFusion

Today I was looking at returning random records purely from MySQL but was limited with large tables in getting all random results (not just a random starting seed).

I tried a ColdFusion solution instead.

First get all the primary keys from the table in question and create an array of the values (this works for numeric and non-numeric keys):

<cfquery name="qGetPKs" datasource="dsn">
	SELECT 		Id
	FROM 		myTable
</cfquery>

<cfset aIds = listToArray(valueList(qGetPKs.Id)) />

Next create a list of random primary keys from our array:

<cfset numberOfRows = 20 />
<cfset ids = "" />

<cfloop from="1" to="#numberOfRows#" index="i">
	<cfset ids = listAppend(ids, aIds[randRange(1, arrayLen(aIds))]) />
</cfloop>

With smaller tables you may need to do this step a little differently in case the randRange() returns duplicate Ids.

Finally we pass the list of random primary keys to the WHERE clause via cfqueryparam:

<cfquery name="q" datasource="dsn">
	SELECT 		*
	FROM 		myTable
	WHERE		Id IN(<cfqueryparam cfsqltype="cf_sql_varchar" list="true" value="#ids#">)
</cfquery>

Depending on your situation you can cache the array of primary keys (in session or application scope) and refresh the list whenever your database changes. Then generate your random list of Id’s to pass to a query.

Without caching the list array (and using a table with 56,000 rows) I got the following times from the entire block of code (including getting the Id list):
cftimer

When I cached the array in application scope, cftimer was giving me 0ms :)
cftimer2

This was of course on my local machine. Although these results could be more accurate in a proper environment, they do show another solution in retrieving random query results.

Post a Comment or Leave a Trackback

7 Comments

  1. March 21, 2008 at 12:21 pm | Permalink

    You are open to duplicate rows.
    Try something like:

  2. March 21, 2008 at 1:10 pm | Permalink

    I note that this means you can’t use an ORDER BY clause in your SQL statement, which may be quite a disadvantage in some circumstances. If you can loop through the query with a cfloop instead of a cfoutput, you can achieve random results much faster and with only one hit on the database by just looping through a randomized list of query rows:
    http://www.mollerus.net/tom/blog/2008/03/creating_randomlyordered_lists.html

  3. March 21, 2008 at 2:02 pm | Permalink

    @Kris – your comment didn’t quite come through.

    I did say that duplicate records could be returned with randRange(), is that what you meant?

  4. March 21, 2008 at 2:30 pm | Permalink

    @Tom – Why would you not be able to use an ORDER BY clause?

    All we’re doing is saying “get me all the records which have ‘x’ as the primary key”. You can still use ORDER BY anything you want.

    Also I’m not quite sure I get your post (in terms of a query). Will that not return a random list of row numbers? How will that help when looping to output a resultset?

  5. Kevin Sargent
    March 23, 2008 at 12:29 am | Permalink

    I’m going to guess that a stored proc would be even better code-wise and speedilicousness-wise. have you tested that? I ask sincerely, because I will be doing a lot of random row queries on a big table soon and wanna know :)

  6. November 19, 2008 at 12:56 am | Permalink

    This was very helpful. I am a PHP coder who only today started working on some CF scripts for a new project. I managed to get a random record script up and running with this code.

    Thanks!

    David

  7. April 16, 2010 at 6:39 am | Permalink

    I’m using a different method, once i get my ids I do the following, my list is randomised everytime but the results are ordered in id ascending order!:(

    SELECT * from mytable where id IN ()

One Trackback

  1. [...] History is what was written down. « Options for accessing the ColdFusion 8 documentation Random database results with ColdFusion [...]

Post a Comment

Your email is never published nor shared. Required fields are marked *

*
*