[sup-devel] [PATCH] switch default index to Xapian



Previous versions didn't add an :index entry in config.yaml, so preserve
compatibility by using Ferret if no index is specified and the ferret directory
exists.
---
This patch is meant for 0.10.

AFAIK the Xapian index has feature parity with Ferret. There are a couple of
issues remaining with queries:

Xapian::TermGenerator stems and otherwise munges names for convenient
searching, while email addresses are stored verbatim. Xapian::QueryParser needs
to apply the same alterations to search terms, so the parser uses separate
from_{name,email} fields. This is not user-friendly, but it could be worked
around by having parse_query insert an OR over both fields wherever it sees a
from: prefix (and likewise for to:).
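
The parse_query workaround could look roughly like the sketch below. It is only
an illustration: expand_address_prefixes is a hypothetical helper, not existing
sup code, and the to_name/to_email field names are assumed by analogy with
from_{name,email}. It also assumes the rewritten string is later handed to a
QueryParser that understands parentheses and OR.

```ruby
# Hypothetical helper (not part of sup): rewrite from:/to: prefixes into an
# OR over the separate name and email fields before the query is parsed.
# Limitation: a quoted value containing spaces (from:"Alice Smith") is not
# handled, because only the first whitespace-delimited token is captured.
def expand_address_prefixes(query)
  query.gsub(/\b(from|to):(\S+)/) do
    field, value = $1, $2
    "(#{field}_name:#{value} OR #{field}_email:#{value})"
  end
end

puts expand_address_prefixes("from:alice label:inbox")
# => (from_name:alice OR from_email:alice) label:inbox
```

Explicit from_name:/from_email: prefixes are left alone, so users who already
type the separate fields would not be affected.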

A more pernicious issue is that QueryParser defaults to AND if there isn't an
explicit operator (which is what we want), but if there are multiple boolean
(label/email) terms over the same field it will OR them. So, "label:sup
label:patch" will result in the union instead of the intersection. Assuming we
don't want to write our own query parser, this needs to be made configurable in
Xapian. I took a stab at it a few months ago but didn't get anywhere.
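
To make the difference concrete, here is a tiny pure-Ruby model of the two
behaviours. It does not touch Xapian at all; the message ids and labels are
made up for illustration.

```ruby
require 'set'

# Made-up data for illustration: message id => labels on that message.
MESSAGE_LABELS = {
  1 => %w[sup patch],
  2 => %w[sup],
  3 => %w[patch],
  4 => %w[inbox],
}

# Ids of all messages carrying the given label.
def with_label(label)
  MESSAGE_LABELS.select { |_id, labels| labels.include?(label) }.keys.to_set
end

# What "label:sup label:patch" gives today: same-field boolean terms are ORed.
union = with_label("sup") | with_label("patch")

# What we actually want: only messages carrying both labels.
intersection = with_label("sup") & with_label("patch")

puts "OR:  #{union.to_a.sort.inspect}"         # OR:  [1, 2, 3]
puts "AND: #{intersection.to_a.sort.inspect}"  # AND: [1]
```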

There's also the issue of long delays when flushing the index to disk on exit.
One option is to keep the delay and log an info message saying what's going on.
A second option is to set the XAPIAN_FLUSH_THRESHOLD environment variable to
something low in bin/sup, which will limit the final delay but potentially
cause short delays during normal use. A third option is to detect when the user
has been idle for a while and flush the index then.
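
A minimal sketch of the third option follows. IdleFlusher is a hypothetical
class, not existing sup code, and the block passed to it stands in for whatever
call actually flushes the index. The clock is injectable only so the logic can
be exercised without waiting.

```ruby
# Hypothetical sketch: flush the index once the user has been idle for a while.
class IdleFlusher
  # idle_seconds: how long the user must be inactive before flushing.
  # clock:        callable returning the current time (injectable for tests).
  # flush:        block that performs the real index flush (assumed).
  def initialize(idle_seconds, clock: -> { Time.now }, &flush)
    @idle_seconds = idle_seconds
    @clock = clock
    @flush = flush
    @last_activity = @clock.call
    @dirty = false
  end

  # Call on every keystroke or other user action.
  def touch
    @last_activity = @clock.call
  end

  # Call whenever something has been written to the index.
  def mark_dirty
    @dirty = true
  end

  # Call periodically. Flushes at most once per dirty period and returns
  # true only when a flush actually happened.
  def tick
    return false unless @dirty
    return false if @clock.call - @last_activity < @idle_seconds
    @flush.call
    @dirty = false
    true
  end
end
```

The idea would be to call touch from the input loop, mark_dirty after index
writes, and tick from some periodic timer. A flush on exit would still be
needed, but it should usually have little left to write.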

We can easily fix the first and third issues before 0.10. Are there any others
I've forgotten?

 lib/sup.rb       |    5 +++--
 lib/sup/index.rb |    2 +-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/lib/sup.rb b/lib/sup.rb
index 144f5e3..fa19de2 100644
--- a/lib/sup.rb
+++ b/lib/sup.rb
@@ -54,7 +54,7 @@ module Redwood
   YAML_DOMAIN = "masanjin.net"
   YAML_DATE = "2006-10-01"
 
-  DEFAULT_INDEX = 'ferret'
+  DEFAULT_INDEX = 'xapian'
 
   ## record exceptions thrown in threads nicely
   @exceptions = []
@@ -229,7 +229,8 @@ else
     :confirm_top_posting => true,
     :discard_snippets_from_encrypted_messages => false,
     :default_attachment_save_dir => "",
-    :sent_source => "sup://sent"
+    :sent_source => "sup://sent",
+    :index => Redwood::DEFAULT_INDEX,
   }
   begin
     FileUtils.mkdir_p Redwood::BASE_DIR
diff --git a/lib/sup/index.rb b/lib/sup/index.rb
index 87d8d52..cc78292 100644
--- a/lib/sup/index.rb
+++ b/lib/sup/index.rb
@@ -174,7 +174,7 @@ class BaseIndex
   end
 end
 
-index_name = ENV['SUP_INDEX'] || $config[:index] || DEFAULT_INDEX
+index_name = ENV['SUP_INDEX'] || $config[:index] || (File.exists?(File.join(BASE_DIR, 'ferret')) ? 'ferret' : DEFAULT_INDEX)
 case index_name
   when "xapian"; require "sup/xapian_index"
   when "ferret"; require "sup/ferret_index"
-- 
1.6.3.3

_______________________________________________
Sup-devel mailing list
Sup-devel@rubyforge.org
http://rubyforge.org/mailman/listinfo/sup-devel