2013/2/7 21:55:29
|
---|
|
Improving search efficiency and speed
I had a problem with some of my modules crashing when returning large result sets (e.g. more than 600 records), so I took a look at search.php. I think it can be made more efficient (or at least, module search functions can be adjusted to reduce the load it generates). This file runs each module’s search function in a loop if you have global searches enabled. The potential problems arise if you have deep searching enabled (which most people do).
The key issue is that search.php expects modules to return all search results. For example, if your news module has 384 results for a particular search, it should return 384 article objects. search.php counts the result set to determine how many ‘hits’ there were for each module, which is displayed on the initial search results page, and to build pagination controls on the ‘show all results’ page. This is a very resource-intensive way to get a simple count of the results, and the search will crash out if the result set is too large to handle.
We don’t actually need to retrieve more search result objects than are displayed per page (set in the search preferences, usually about 20). It is more efficient to break the search into two queries, like we do with pagination controls:
i) A simple getCount() with the search criteria, to determine how many results there are.
ii) A getObjects() with $offset and $limit values set, to retrieve only those objects we actually need.
Since the module search function can only return one variable back to search.php, you can preserve the count information (and avoid having to modify search.php) by padding the results array out to the full result count with TRUE values (1). Just make sure you add $offset placeholder entries on the left and the remaining difference on the right, so that the ‘real’ results are in the expected place in the array.
Returning a padded array is not an ideal solution, but it is still much less resource-intensive than retrieving all the objects (especially when implemented over all modules on your site) and will let the site handle much bigger result sets.
In the long term, I think it would be better to modify search.php to expect the first element in the results array returned by module search functions to be a result count, but this would require modules to be updated.
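The padding approach described above can be sketched roughly as follows. This is only an illustration, not the code from the attached zip file: padSearchResults() is a hypothetical helper name, the sample data stands in for what a handler's getCount() and getObjects() calls would return, and the type hints assume PHP 7 or later.

```php
<?php
// Hypothetical helper (not part of ImpressCMS): pad one page of results so
// that count() on the returned array equals the total number of matches.
function padSearchResults(array $pageResults, int $totalCount, int $offset): array
{
    // Placeholders standing in for the results before the requested page.
    $left = array_fill(0, $offset, true);

    // Placeholders for everything after the requested page.
    $rightLength = max(0, $totalCount - $offset - count($pageResults));
    $right = array_fill(0, $rightLength, true);

    // array_merge() renumbers numeric keys, so the real results end up at
    // positions $offset .. $offset + count($pageResults) - 1.
    return array_merge($left, array_values($pageResults), $right);
}

// Stand-in data: in a real module these values would come from the handler's
// getCount() and getObjects() calls, using the same search criteria.
$totalCount = 441;
$offset = 40;
$limit = 20;
$pageResults = [];
for ($i = 0; $i < $limit; $i++) {
    $pageResults[] = ['title' => 'Article ' . ($offset + $i + 1)];
}

$padded = padSearchResults($pageResults, $totalCount, $offset);

// The caller can still count the hits and slice out the page it wants.
$hits = count($padded);
$visible = array_slice($padded, $offset, $limit);
```

The placeholders are cheap booleans rather than objects, so the array keeps the length search.php counts while only one page of real results is loaded.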
Looking at timers for a particular search query with 441 results on my site (searching a single module), I saw a significant speed improvement on the search page:
Before search improvement
- ICMS took 1.12 seconds to load
- Module display took 0.93 seconds to load
After search improvement
- ICMS took 0.357 seconds to load
- Module display took 0.136 seconds to load
For code examples, please see the attached zip file (including getPublicationsForSearch() in the handler).
A fix for the ‘show modules with no match in search results’ bug
I found a simple fix (I think I submitted the ticket in the wrong place, sorry). Change search.php line 186 (trunk) to:
if (!$icmsConfigSearch['search_no_res_mod']) {
    unset($all_results[$modname], $all_results_counts[$modname]);
}
Edited by Madfish on 2013/2/10 18:31:56
Edited by Madfish on 2013/2/10 18:32:17 |
2013/2/7 23:52:51
|
---|
|
Re: Improving search efficiency and speed
Wow, very nice effect!
|
2013/2/8 6:06:05
|
---|
|
Re: Improving search efficiency and speed
This is definitely an improvement! I'll create a Git branch for this, so it can be handled in the merge request flow and be integrated into the core.
Is this change done on 1.3.x or 2.0 (not sure if that matters for the search, really)? |
2013/2/9 4:58:20
|
---|
|
Re: Improving search efficiency and speed
I haven't altered anything in the repository (for the bugfix), as I do not understand how the core is managed or how the merging stuff works. It works on both 1.3.1 and 1.3.4 though.
I made some errors in the attached files that messed up pagination of search results. I will post the revised files on Monday (and hopefully upgrade the gone native modules next week). |
2013/2/10 18:32:58
|
---|
|
Re: Improving search efficiency and speed
OK, the revised files are now attached to the first post (I updated the zip file).
|
2013/2/11 0:06:32
|
---|
|
Re: Improving search efficiency and speed
Thanks for this update.
|
2013/2/16 11:15:40
|
---|
|
Re: Improving search efficiency and speed
So I guess we could apply this to other IPF-based modules?
|
2013/2/16 20:34:31
|
---|
|
Re: Improving search efficiency and speed
Yep, it's a fairly minor modification of the default search function generated by ImBuilding. It takes about 5 minutes to change and test. I've got it running site-wide now and it makes a big difference (for searches with a large result set).
Ultimately though, it would be better if we could agree on a convention for module search functions to return to search.php a total count of results (in addition to the subset of results actually required for display). That would break search for existing modules though, so it might be something best left for when the legacy module compatibility gap finally happens.
For those of us who are still XOOPSers, I had a quick look and I think this applies to XOOPS as well. I don't think they are retrieving a total count of results at all, and they are hard-coding the maximum size of the result set (20) as well, probably because of the efficiency problem. |
2013/7/22 20:30:08
|
---|
|
Re: Improving search efficiency and speed
Just used part of this to move ticket #452 (the 'no results' problem) into testing.
Let's open a task for other search improvements - I haven't seen one recently. |
2023/1/1 15:42:58
|
---|
|
Re: Improving search efficiency and speed
I was looking for something else and came across this post. Things have moved around a bit and I couldn't find a branch or a ticket about this. We can still benefit from this approach. I'll add a new issue on GitHub, unless someone can find the original. |
2023/1/2 4:22:12
|
---|
|
Re: Improving search efficiency and speed
I think it's safe to say that the ticket is gone in the meantime. It would be better to create a new one for that. I would prefer to focus this change on 2.0 first; it can eventually be backported to 1.5.x later, but it is important to give precedence to the 2.0 branch now. Search is an area where we can improve a lot, for example by making it easy at core level to use external search providers alongside the standard search.
|
2023/1/2 19:00:36
|
---|
|
Re: Improving search efficiency and speed
It would definitely be good to see 2.0 come to reality. And for us to use what we have learned for the benefit of us all. It's even better when we can incorporate what community members share with us.
One of our core promises was to provide all users with a path to upgrade. It was the foundation for the development of the deprecated messages in debug. I have yet to comprehend what it will take to make a module compatible with 2.0, or if there is an upgrade from 1.3/1.4/1.5 to 2.x.
Even with that commitment, 1.2 > 1.3 was a 'start over' for some webmasters (including me). There were modules that didn't make the transition and replacements need to be found (still). 1.3 > 1.4 dropped more modules because they weren't updated by their developers and they haven't been adopted by anyone else. As a founder and one of the developers I take responsibility for this. I am also a stakeholder and have sites that rely on what we produce.
Don't get me wrong - what's been happening with 2.0 is going to be a great thing. I don't have the skills to have done any of it. I also lack the ability to explain it.
I've gone off topic for this thread - I apologize. I'll move to another topic where we can continue the conversation. |
2023/1/3 6:09:00
|
---|
|
Re: Improving search efficiency and speed
I split off this conversation into this new thread on the forums.
Edited by fiammybe on 2023/1/3 9:29:10
|
2023/2/2 23:50:34
|
---|
|
Re: Improving search efficiency and speed
@skenow, did you create a ticket for this search optimisation on GitHub? |
2023/2/3 15:38:56
|
---|
|
Re: Improving search efficiency and speed
Yes, I did - https://github.com/ImpressCMS/impresscms/issues/1368
|