Improving search efficiency and speed

Posted by Madfish on 1360302929
I had a problem with some of my modules crashing when returning large result sets (e.g. more than 600 records), so I took a look at search.php. I think it can be made more efficient (or at least, module search functions can be adjusted to reduce the load it generates). If you have global searches enabled, this file runs each module’s search function in a loop. The potential problems arise if you have deep searching enabled (which most people do).

The key issue is that search.php expects modules to return all search results, e.g. if your news module has 384 results for a particular search, it should return 384 article objects. search.php then counts the result set to determine how many ‘hits’ there were for each module (displayed on the initial search results page) and to build pagination controls on the ‘show all results’ page.

This is a very resource-intensive way to get a simple count of the results, and the search will crash out if the result set is too large to handle. We don’t actually need to retrieve more than the number of search result objects that is displayed per page (set in the search preferences, usually about 20).

It is more efficient to break the search into two queries, like we do with pagination controls:

i) A simple getCount() with the search criteria, to determine how many results there are.

ii) A getObjects() with $offset and $limit values set, to retrieve only those objects we actually need.
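The two queries above can be sketched roughly as follows. This is only an illustration: DemoHandler is a hypothetical in-memory stand-in for a module handler, and the getCount()/getObjects() signatures shown here are assumptions, not the real handler API. A real handler would issue a COUNT query and a SELECT with LIMIT/OFFSET rather than filtering an array.

```php
<?php
// Hypothetical stand-in for a module's object handler (not the real API).
// A real handler would query the database instead of filtering an array.
class DemoHandler
{
    private $rows;

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    // Query (i): how many results match, without returning any objects.
    public function getCount($term)
    {
        return count($this->match($term));
    }

    // Query (ii): only the slice of results needed for the current page.
    public function getObjects($term, $offset, $limit)
    {
        return array_slice($this->match($term), $offset, $limit);
    }

    // Case-insensitive substring match, re-indexed from 0.
    private function match($term)
    {
        return array_values(array_filter($this->rows, function ($title) use ($term) {
            return stripos($title, $term) !== false;
        }));
    }
}
```

With 384 matching rows and 20 results per page, getObjects($term, 40, 20) would return only the 20 rows for page three, while getCount($term) still reports 384.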

Since the module search function can only return one variable back to search.php, you can preserve the count information (and avoid having to modify search.php) by padding the results array with TRUE values (1) until it is as long as the full result count. Just make sure you put $offset padding elements on the left and the remainder (the count minus the offset and the number of objects retrieved) on the right, so that the ‘real’ results are in the expected place in the array.
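A minimal sketch of the padding step, assuming you already have the total count, the offset and the objects for the current page. The function name padSearchResults() is made up for this example and is not part of ImpressCMS or of the attached code.

```php
<?php
// Hypothetical helper (name invented for illustration).
// $pageObjects: the real results retrieved for the current page
// $count:       total number of matches reported by the count query
// $offset:      index at which the current page starts
function padSearchResults(array $pageObjects, $count, $offset)
{
    // Guard against an offset beyond the end of the result set.
    $offset = min($offset, $count);

    // Left padding: one placeholder (1) per result before the current page.
    $results = array_pad(array(), $offset, 1);

    // The real results occupy the positions starting at $offset.
    $results = array_merge($results, array_values($pageObjects));

    // Right padding: placeholders until the array length equals $count.
    return array_pad($results, $count, 1);
}
```

For example, padSearchResults($objects, 441, 20) with 20 objects gives an array of 441 elements in which positions 20 to 39 hold the real objects and everything else is a placeholder, so counting the array still yields the full number of hits.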

Returning a padded array is not an ideal solution, but it is still much less resource intensive than retrieving all the objects (especially when implemented over all modules on your site) and will let the site handle much bigger result sets. In the long term, I think it would be better to modify search.php to expect the first element in the results array returned by module search functions to be a result count, but this would require modules to be updated.
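To illustrate the longer-term idea, here is a rough sketch of what a "count first" return format could look like. Both function names are hypothetical, and search.php does not currently work this way; it would need to be changed to unpack the count before using the results.

```php
<?php
// Hypothetical module-side helper: put the total count in front of the
// page of results that was actually retrieved.
function packResultsWithCount(array $pageObjects, $count)
{
    return array_merge(array((int) $count), array_values($pageObjects));
}

// Hypothetical caller-side helper: split the leading count from the results.
// Returns array($count, $results); an empty input gives a count of 0.
function unpackResultsWithCount(array $returned)
{
    $count = (int) array_shift($returned);
    return array($count, $returned);
}
```

The advantage over padding is that the returned array never grows beyond one page of results plus a single integer, whatever the total number of hits.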

Looking at timers for a particular search query with 441 results on my site (searching a single module), the search page loaded roughly three times faster overall, and the module display was nearly seven times faster:

Before search improvement
- ICMS took 1.12 seconds to load
- Module display took 0.93 seconds to load

After search improvement
- ICMS took 0.357 seconds to load
- Module display took 0.136 seconds to load

For code examples, please see the attached zip file (including getPublicationsForSearch() in the handler).

A fix for the ‘show modules with no match in search results’ bug
I found a simple fix (I think I submitted the ticket in the wrong place, sorry). Change search.php line 186 (trunk) to:

if (!$icmsConfigSearch['search_no_res_mod']) { unset($all_results[$modname], $all_results_counts[$modname]); }

Attach file:

zip Size: 4.51 KB; Hits: 319
