Current time: 01-28-2021, 07:47 PM Hello There, Guest! (LoginRegister)

Post Reply 
 
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Database Query Search Time
08-26-2010, 08:29 PM (This post was last modified: 08-26-2010 08:30 PM by green-watch.org.)
Post: #47
RE: Database Query Search Time
Hey Guys,

Revisiting a few issues I have talked about.

#1 - Price Ranges - instead of storing price ranges, you can store the actual prices. However, I found it to be important that all prices have the same amount of characters even if the data field is stored as type double. In order to do this, I get the string length of the price and then add leading zeros accordingly depending on what the length is. Search time you can do something like this to get prices between $98 dollars and $108 dollars: custom2:[000000098.00 TO 000000108.00]

#2 - Boolean String Field - Remember how I said I am storing a boolean system in my custom4 field. Take the number "010110050100" for example. By the way, this field is being stored as an integer data type now.

0 - Leading 0 needed for wildcard system to work
1 - Product Validated (0 or 1)
0 - Has External Photo (0 or 1)
1 - Has Internal Photo (0 or 1)
1 - Business Validated (0 or 1)
0 - Error Viewing Photo (0 or 1)
0 - Is Featured Product (0 or 1)
5 - Business Type ID (1 or 2 or 3 or 4 or 5)
0 - HTML Markup Error (0 or 1)
1 - Greenness Updated (0 or 1)
0 - Short Product Description (0 or 1)
0 - Ending 0 needed for wildcard system to work

At search time, if I want to find products that are not validated, I can do something like this: custom4:00?????????0

At search time, if I want to find products that are not validated and have a short product description, I can do this: custom4:00????????10

Both of these changes allow for the collection to be smaller. On another note, here is version 1.4 of my SOLR_SEARCH:

Code:
<cfsetting requestTimeOut = "60">

<cfset SOLR_SERVER_HOST = 'localhost'>
<cfset SOLR_SERVER_PORT = '8983'>
<cfset SOLR_SERVER_VERSION = '2.2'>

<cffunction name="SOLR_SEARCH" returnType="struct">

  <cfargument name="collection" type="string" required="true" default="">
  <cfargument name="criteria" type="string" required="false" default="*:*">
  <cfargument name="startrow" type="string" required="false" default="0">
  <cfargument name="maxrows" type="string" required="false" default="10">
  <cfargument name="sort" type="string" required="false" default="">
  <cfargument name="columns" type="string" required="false"
  default="category,custom1,custom2,custom3,custom4,key,score,size,summary,title,spell">
  <cfargument name="facet" type="string" required="false" default="">
  <cfargument name="timeout" type="string" required="false" default="10">
  <cfargument name="timeAllowed" type="string" required="false" default="10">
  
  <cfset returnStruct = StructNew()>
  
  <cfif NOT IsDefined("timeout") OR NOT IsNumeric(timeout) OR #timeout# LT 1>
    <cfset timeout = 10>
  </cfif>

  <cfif NOT IsDefined("timeAllowed") OR NOT IsNumeric(timeAllowed) OR #timeAllowed# LT 1>
    <cfset timeAllowed = 10>
  </cfif>

  <cfif NOT IsDefined("startrow") OR NOT IsNumeric(startrow) OR #startrow# LT 0>
    <cfset startrow = 0>
  </cfif>

  <cfif NOT IsDefined("maxrows") OR NOT IsNumeric(maxrows) OR #maxrows# LT 1>
    <cfset maxrows = 1>
  </cfif>

  <cfif NOT IsDefined("criteria")>
    <cfset criteria = "*:*">
  </cfif>

  <cfset myURL = 'http://#SOLR_SERVER_HOST#:#SOLR_SERVER_PORT#/solr/#collection#/select/'>
  <cfset myURL = '#myURL#?q=#criteria#&version=#SOLR_SERVER_VERSION#&start=#startrow#&rows=#maxrows#&indent=off'>

  <cfif IsDefined("sort") AND #sort# NEQ "">
    <cfset myURL = '#myURL#&sort=#sort#'>
  </cfif>

  <cfif IsDefined("columns") AND #columns# NEQ "">
    <cfset myURL = '#myURL#&fl=#columns#'>
  </cfif>

  <cfif IsDefined("facet") AND #facet# NEQ "">
    <cfset myURL = '#myURL#&facet=true&#facet#'>
  </cfif>  

  <cfif IsDefined("timeAllowed") AND #timeAllowed# NEQ "">
    <cfset myURL = '#myURL#&timeAllowed=#timeAllowed#'>
  </cfif>  

  <cfhttp method="Get" url="#myURL#" resolveurl="Yes" timeout="#timeout#" throwonerror="yes" />

  <cfset xmlfile=XMLParse(cfhttp.FileContent)>

  <cfset numResultsFound = #xmlfile.XmlRoot.XmlChildren[2].XmlAttributes.numFound#>
  <cfset numResultsReturned = #ArrayLen(xmlfile.XmlRoot.XmlChildren[2].XmlChildren)#>

  <cfset partialResults = 'false'>
  <cfset queryFieldList = ''>
  <cfset queryStatus = ''>
  <cfset queryTime = ''>

  <cfloop index="i" from="1" to="#ArrayLen(xmlfile.XmlRoot.XmlChildren[1].XmlChildren)#">  
    <cfif #xmlfile.XmlRoot.XmlChildren[1].XmlChildren[i].XmlAttributes.name# EQ "params">
      <cfloop index="j" from="1" to="#ArrayLen(xmlfile.XmlRoot.XmlChildren[1].XmlChildren[i].XmlChildren)#">
        <cfif #xmlfile.XmlRoot.XmlChildren[1].XmlChildren[i].XmlChildren[j].XmlAttributes.name# EQ "fl">
          <cfset queryFieldList = '#xmlfile.XmlRoot.XmlChildren[1].XmlChildren[i].XmlChildren[j].XmlText#'>
        </cfif>
      </cfloop>
    <cfelseif #xmlfile.XmlRoot.XmlChildren[1].XmlChildren[i].XmlAttributes.name# EQ "status">
      <cfset queryStatus = '#xmlfile.XmlRoot.XmlChildren[1].XmlChildren[i].XmlText#'>
    <cfelseif #xmlfile.XmlRoot.XmlChildren[1].XmlChildren[i].XmlAttributes.name# EQ "QTime">
      <cfset queryTime = '#xmlfile.XmlRoot.XmlChildren[1].XmlChildren[i].XmlText#'>
    <cfelseif #xmlfile.XmlRoot.XmlChildren[1].XmlChildren[i].XmlAttributes.name# EQ "partialResults">
      <cfset partialResults = '#xmlfile.XmlRoot.XmlChildren[1].XmlChildren[i].XmlText#'>      
    </cfif>      
  </cfloop>

  <cfif #queryFieldList# NEQ "">    
      
    <cfset xmlqry = QueryNew("#queryFieldList#")>  
        
    <cfif #numResultsReturned# NEQ 0>

      <cfset xmlqryRows = QueryAddRow(xmlqry,numResultsReturned)>

      <cfloop index="i" from="1" to="#numResultsReturned#">
          
        <cfif #ArrayLen(xmlfile.XmlRoot.XmlChildren[2].XmlChildren[i].XmlChildren)# NEQ 0>    
          
          <cfloop index="j" from="1" to="#ArrayLen(xmlfile.XmlRoot.XmlChildren[2].XmlChildren[i].XmlChildren)#">
            
            <cfloop list="#queryFieldList#" index="k">
            
              <cfset myElement = '#xmlfile.XmlRoot.XmlChildren[2].XmlChildren[i].XmlChildren[j].XmlAttributes#'>
              
                 <cfif #ArrayLen(StructFindValue(myElement,k))# NEQ 0>

                <cfif #k# EQ "spell">
                  <cfset myElement = '#xmlfile.XmlRoot.XmlChildren[2].XmlChildren[i].XmlChildren[j].XmlChildren[1].XmlText#'>
                  <cfset xmlqryCell = QuerySetCell(xmlqry,"#k#",myElement,"#i#")>              
                <cfelseif #k# NEQ "category">
                  <cfset myElement = '#xmlfile.XmlRoot.XmlChildren[2].XmlChildren[i].XmlChildren[j].XmlText#'>
                  <cfset xmlqryCell = QuerySetCell(xmlqry,"#k#",myElement,"#i#")>
                <cfelse>
                  <cfset catList = '#xmlfile.XmlRoot.XmlChildren[2].XmlChildren[i].XmlChildren[j].XmlChildren[1].XmlText#'>
                  <cfset xmlqryCell = QuerySetCell(xmlqry,"#k#",catList,"#i#")>
                </cfif>
              
              </cfif>        
      
            </cfloop>
      
          </cfloop>
        
        </cfif>
    
      </cfloop>
      
    </cfif>
  
    <cfset xmlqry2 = QueryNew("RESULTS_FOUND,QUERY_STATUS,QUERY_TIME,RESULTS_RETURNED,PARTIAL_RESULTS")>
    <cfset xmlqryRows = QueryAddRow(xmlqry2,1)>
    <cfset xmlqryCell = QuerySetCell(xmlqry2,"RESULTS_FOUND","#numResultsFound#",1)>
    <cfset xmlqryCell = QuerySetCell(xmlqry2,"QUERY_STATUS","#queryStatus#",1)>
    <cfset xmlqryCell = QuerySetCell(xmlqry2,"QUERY_TIME","#queryTime#",1)>
    <cfset xmlqryCell = QuerySetCell(xmlqry2,"RESULTS_RETURNED","#numResultsReturned#",1)>
    <cfset xmlqryCell = QuerySetCell(xmlqry2,"PARTIAL_RESULTS","#partialResults#",1)>

    <cfset xmlqry3 = QueryNew("QUERY_TIMEOUT,NULL_FIELDLIST")>
    <cfset xmlqryRows = QueryAddRow(xmlqry3,1)>
    <cfset xmlqryCell = QuerySetCell(xmlqry3,"QUERY_TIMEOUT","0",1)>
    <cfset xmlqryCell = QuerySetCell(xmlqry3,"NULL_FIELDLIST","0",1)>
    
    <cfset returnStruct[0] = xmlqry3>
    <cfset returnStruct[1] = xmlqry2>
    <cfset returnStruct[2] = xmlqry>
    
  <cfelse>

    <cfset xmlqry3 = QueryNew("QUERY_TIMEOUT,NULL_FIELDLIST")>
    <cfset xmlqryRows = QueryAddRow(xmlqry3,1)>
    <cfset xmlqryCell = QuerySetCell(xmlqry3,"QUERY_TIMEOUT","0",1)>    
    <cfset xmlqryCell = QuerySetCell(xmlqry3,"NULL_FIELDLIST","1",1)>
    <cfset returnStruct[0] = xmlqry3>
        
  </cfif>
  
  <cfreturn returnStruct>
  
</cffunction>

<cfparam name="URL.SEARCH_FOR" default="" type="string">

<cfset args = StructNew()>
<cfset args.collection = "products_id2">
<cfset args.startrow = 0>
<cfset args.maxrows = 10>
<cfset args.columns = 'key,custom1,custom2,custom3,custom4,summary,title,spell'>
<cfset args.sort = 'score+desc'>
<cfset args.criteria = '#URL.SEARCH_FOR#'>
<cfset args.timeout = 5>
<cfset args.timeAllowed = 3000>

<cfinvoke method="SOLR_SEARCH" argumentCollection="#args#" returnVariable="res">

<cfdump var="#res#">

When you submit a document to SOLR, it stores a short version of the description in the summary field and stores the entire version in some spell field.

Now that I am returning the entire product description, price, product id, and product name, I do not have to query my database after I retrieve the key value. This makes an even faster search yet. Time to reindex yet again (:

Sincerely,
Travis Walters
Find all posts by this user
Quote this message in a reply
Post Reply 


Messages In This Thread
RE: Database Query Search Time - pmeenan - 07-04-2010, 12:03 PM
RE: Database Query Search Time - pmeenan - 07-05-2010, 04:35 AM
RE: Database Query Search Time - sajal - 07-05-2010, 05:03 AM
RE: Database Query Search Time - pmeenan - 07-08-2010, 08:47 PM
RE: Database Query Search Time - sajal - 08-13-2010, 10:06 PM
RE: Database Query Search Time - sajal - 08-14-2010, 03:38 AM
RE: Database Query Search Time - sajal - 08-14-2010, 06:50 AM
RE: Database Query Search Time - pmeenan - 08-16-2010, 02:55 AM
RE: Database Query Search Time - sajal - 08-17-2010, 01:14 AM
RE: Database Query Search Time - pmeenan - 08-23-2010, 01:22 AM
RE: Database Query Search Time - pmeenan - 08-23-2010, 08:45 PM
RE: Database Query Search Time - ringman57 - 08-24-2010, 11:52 PM
RE: Database Query Search Time - ringman57 - 08-25-2010, 02:54 AM
RE: Database Query Search Time - green-watch.org - 08-26-2010 08:29 PM

Forum Jump:


User(s) browsing this thread: 1 Guest(s)