2013/2/7 21:55:29
#1
Offline
Home away from home

Improving search efficiency and speed

I had a problem with some of my modules crashing when returning large result sets (e.g. > 600 records), so I took a look at search.php. I think it can be made more efficient (or at least, module search functions can be adjusted to reduce the load it generates). This file runs each module’s search function in a loop if you have global searches enabled. The potential problems arise if you have deep searching enabled (which most people do).

The key issue is that search.php expects modules to return all search results, e.g. if your news module has 384 results for a particular search, it should return 384 article objects. The result set is counted by search.php to determine how many ‘hits’ there were for each module, which is displayed on the initial search results page, and to build pagination controls on the ‘show all results’ page.

This is a very resource-intensive way to get a simple count of the results, and the search will crash out if the result set is too large to handle. We don’t actually need to retrieve more than the number of search result objects that is displayed per page (set in the search preferences, usually about 20).

It is more efficient to break the search into two queries, like we do with pagination controls:

i) A simple getCount() with the search criteria, to determine how many results there are.

ii) A getObjects() with $offset and $limit values set, to retrieve only those objects we actually need.
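
The two steps above can be sketched roughly as follows. This is only an illustration: FakeArticleHandler is a hypothetical in-memory stand-in for a module's handler (a real handler would run the two queries against the database), and the method signatures here are simplified assumptions, not the actual handler API.

```php
<?php
// Hypothetical stand-in for a module's object handler (assumption, not core code).
final class FakeArticleHandler
{
    /** @var string[] */
    private array $titles;

    public function __construct(array $titles)
    {
        $this->titles = $titles;
    }

    // Query 1: count the matches without handing back any result objects.
    public function getCount(string $term): int
    {
        return count($this->matching($term));
    }

    // Query 2: fetch only the slice of matches needed for the current page.
    public function getObjects(string $term, int $offset, int $limit): array
    {
        return array_slice($this->matching($term), $offset, $limit);
    }

    // In-memory filter standing in for the WHERE clause of a real query.
    private function matching(string $term): array
    {
        return array_values(array_filter(
            $this->titles,
            fn(string $t): bool => stripos($t, $term) !== false
        ));
    }
}

// Build 384 fake articles, matching the example figure used earlier.
$titles = [];
for ($i = 1; $i <= 384; $i++) {
    $titles[] = "News article $i";
}
$handler = new FakeArticleHandler($titles);

$total = $handler->getCount('news');           // total number of hits
$page  = $handler->getObjects('news', 40, 20); // third page of 20 results only

echo "$total hits, ", count($page), " objects loaded\n";
```

The point is that the count and the page of objects come from separate calls, so the number of objects built never exceeds the per-page limit.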

Since the module search function can only return one variable back to search.php, you can preserve the count information (and avoid having to modify search.php) by padding the results array with TRUE values (1) until its length equals the total result count. Just make sure you add $offset placeholders on the left and the remainder (the total count minus the offset and the number of real results) on the right, so that the ‘real’ results are in the expected place in the array.
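
A minimal sketch of the padding idea, assuming the caller counts the returned array and reads the current page starting at the offset. padSearchResults() is a hypothetical helper name, not something that exists in the core or in the attached files.

```php
<?php
// Hypothetical helper: pad one page of real results out to the full result count.
function padSearchResults(array $pageResults, int $total, int $offset): array
{
    // Left padding: one placeholder for every result that precedes this page.
    $padded = array_merge(array_fill(0, $offset, 1), array_values($pageResults));

    // Right padding: array_pad() appends placeholders until there are $total entries.
    return array_pad($padded, $total, 1);
}

// Example: 5 results in total, showing the 2 that start at offset 2.
$padded = padSearchResults(['c', 'd'], 5, 2);

// The caller sees the full count, and the real results sit at the offset.
echo count($padded), "\n";
print_r(array_slice($padded, 2, 2));
```

The placeholders are cheap scalars, so the array has the right length for counting and pagination without any extra objects being loaded.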

Returning a padded array is not an ideal solution, but it is still much less resource intensive than retrieving all the objects (especially when implemented over all modules on your site) and will let the site handle much bigger result sets. In the long term, I think it would be better to modify search.php to expect the first element in the results array returned by module search functions to be a result count, but this would require modules to be updated.
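
To illustrate the longer-term idea, here is a rough sketch of how a caller could unpack a result array whose first element is the total count. This convention does not exist in search.php today; splitCountedResults() is a hypothetical name used only for this example.

```php
<?php
// Hypothetical convention: element 0 is the total hit count, the rest is one page of results.
function splitCountedResults(array $returned): array
{
    // array_shift() removes and returns the first element; an empty array yields null, cast to 0.
    $count = (int) array_shift($returned);

    return [$count, $returned];
}

// Example: a module reports 441 hits but only hands back two results for display.
[$count, $results] = splitCountedResults([441, 'first result', 'second result']);

echo "$count hits, ", count($results), " results returned\n";
```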

Looking at timers for a particular search query with 441 results on my site (searching a single module), I saw a significant speed improvement on the search page:

Before search improvement
- ICMS took 1.12 seconds to load
- Module display took 0.93 seconds to load

After search improvement
- ICMS took 0.357 seconds to load
- Module display took 0.136 seconds to load
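
As a quick check on those figures, the speedup factor is simply the old time divided by the new time. The snippet below only does that arithmetic on the numbers quoted above; the array names are made up for the example.

```php
<?php
// Timings in seconds, copied from the before/after figures above.
$before = ['ICMS' => 1.12, 'Module display' => 0.93];
$after  = ['ICMS' => 0.357, 'Module display' => 0.136];

$speedup = [];
foreach ($before as $label => $seconds) {
    // Speedup factor = old time / new time.
    $speedup[$label] = $seconds / $after[$label];
    printf("%s: %.1fx faster\n", $label, $speedup[$label]);
}
```

That works out to roughly 3.1x for the overall page load and 6.8x for the module display.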

For code examples, please see the attached zip file (including getPublicationsForSearch() in the handler).

A fix for the ‘show modules with no match in search results’ bug
I found a simple fix (I think I submitted the ticket in the wrong place, sorry). Change search.php line 186 (trunk) to:

if (!$icmsConfigSearch['search_no_res_mod']) { unset($all_results[$modname], $all_results_counts[$modname]); }

Attach file:


zip improved-search.zip Size: 4.51 KB; Hits: 155


Edited by Madfish on 2013/2/10 18:31:56
Edited by Madfish on 2013/2/10 18:32:17

2013/2/7 23:52:51
#2
Offline
Home away from home

Re: Improving search efficiency and speed

Wow, very nice effect!


2013/2/8 6:06:05
#3
Offline
Webmaster

Re: Improving search efficiency and speed

This is definitely an improvement! I'll create a Git branch for this, so it can be handled in the merge request flow and be integrated into the core.

Is this change done on 1.3.x or 2.0 (not sure if that matters for the search, really)?

_________________

Me on Ohloh


2013/2/9 4:58:20
#4
Offline
Home away from home

Re: Improving search efficiency and speed

I haven't altered anything in the repository (for the bugfix), as I do not understand how the core is managed or how the merging stuff works. It works on both 1.3.1 and 1.3.4 though.

I made some errors in the attached files that messed up pagination of search results. I will post the revised files on Monday (and hopefully upgrade the gone native modules next week).


2013/2/10 18:32:58
#5
Offline
Home away from home

Re: Improving search efficiency and speed

Ok revised files now attached to first post (updated the zip file).


2013/2/11 0:06:32
#6
Offline
Home away from home

Re: Improving search efficiency and speed

Thanks for this update


2013/2/16 11:15:40
#7
Offline
Home away from home

Re: Improving search efficiency and speed

So I guess we could apply this to other IPF based modules?


2013/2/16 20:34:31
#8
Offline
Home away from home

Re: Improving search efficiency and speed

Yep, it's a fairly minor modification of the default search function generated by ImBuilding. Takes about 5 minutes to change and test. I've got it running site wide now and it makes a big difference (for searches with a large result set).

Ultimately though it would be better if we can agree on a convention for module search functions to return to search.php a total count of results (in addition to the subset of results actually required for display).

That would break search for existing modules though, so it might be something best left for when the legacy module compatibility gap finally happens.

For those of us who are still XOOPSers, I had a quick look and I think this applies to XOOPS as well. I don't think they are retrieving a total count of results at all, and they are hard coding the maximum size of the result set (20), probably because of the efficiency problem.


2013/7/22 20:30:08
#9
Offline
Home away from home

Re: Improving search efficiency and speed

Just used part of this to move ticket #452 into testing - the 'no results' problem

Let's open a task for other search improvements - I haven't seen one recently.

_________________

Steve Twitter: @skenow Facebook: Steve Kenow


2023/1/1 15:42:58
#10
Offline
Home away from home

Re: Improving search efficiency and speed

I was looking for something else and came across this post. Things have moved around a bit and I couldn't find a branch or a ticket about this. We can still benefit from this approach. I'll add a new issue on GitHub, unless someone can find the original.


_________________

Steve Twitter: @skenow Facebook: Steve Kenow


2023/1/2 4:22:12
#11
Offline
Webmaster

Re: Improving search efficiency and speed

I think it's safe to say that the ticket is gone in the meantime. It will be better to create a new one for that.

I would prefer to focus this change on 2.0 first. Later it can eventually be backported to 1.5.x, but it is important to give precedence to the 2.0 branch now.

Search is an area where we can improve a lot, for example by making it easy at the core level to use external search providers as well, alongside the standard search.

 


_________________

Me on Ohloh


2023/1/2 19:00:36
#12
Offline
Home away from home

Re: Improving search efficiency and speed

It would definitely be good to see 2.0 come to reality. And for us to use what we have learned for the benefit of us all. It's even better when we can incorporate what community members share with us.

One of our core promises was to provide all users with a path to upgrade. It was the foundation for the development of the deprecated messages in debug. I have yet to comprehend what it will take to make a module compatible with 2.0, or if there is an upgrade from 1.3/1.4/1.5 to 2.x. Even with that commitment, 1.2 > 1.3 was a 'start over' for some webmasters (including me). There were modules that didn't make the transition and replacements need to be found (still). 1.3 > 1.4 dropped more modules because they weren't updated by their developers and they haven't been adopted by anyone else.

As a founder and one of the developers I take responsibility for this. I am also a stakeholder and have sites that rely on what we produce. Don't get me wrong - what's been happening with 2.0 is going to be a great thing. I don't have the skills to have done any of it. I also lack the ability to explain it.

I've gone off topic for this thread - I apologize. I'll move to another topic where we can continue the conversation.


_________________

Steve Twitter: @skenow Facebook: Steve Kenow


2023/1/3 6:09:00
#13
Offline
Webmaster

Re: Improving search efficiency and speed

I split off this conversation to this new thread on the forums.



Edited by fiammybe on 2023/1/3 9:29:10
_________________

Me on Ohloh

