[sup-devel] [PATCH] add sup-import-dump: import message state in sup-dump format



sup-import-dump imports message state as exported by sup-dump into the index.
It is a direct replacement for the sup-sync --restored functionality that got
lost when merging the maildir branch.
Unlike sup-sync, it operates on the index only, so it is fast enough to
periodically import full dumps and keep multiple sup installations
synchronised.
It should also be easy enough to add support for a "diff" style format that
would allow replaying "logs" if sup were enhanced to write those in the
future.
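
For reference, each dump line has the form "<message id> (<labels>)". The
following is only a rough sketch of how such a line can be split up; the
helper name parse_dump_line is hypothetical and not part of this patch, and
it assumes labels are separated by whitespace (the patch itself relies on
sup's to_set_of_symbols for that step).

```ruby
require 'set'

# Same pattern the script uses: a message id, a space, then the labels
# enclosed in parentheses.
DUMP_LINE = /^(\S+) \((.*?)\)$/

# Hypothetical helper (not part of the patch): returns the message id and
# a Set of label symbols for one dump line.
def parse_dump_line(line)
  match = DUMP_LINE.match(line.chomp)
  raise ArgumentError, "Can't read dump line: #{line.inspect}" unless match
  mid, labels = match[1], match[2]
  # Assumption: labels are whitespace-separated words.
  [mid, labels.split.map { |x| x.to_sym }.to_set]
end

mid, labels = parse_dump_line("20100101.abc@example.org (inbox unread)\n")
puts "#{mid}: #{labels.to_a.map { |x| x.to_s }.sort.join(' ')}"
```

A line with empty parentheses yields an empty label set, which would clear
all labels on the message when imported.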

To give some rough numbers:

The dump file contains 78104 lines; the index holds about 600k entries. 410
entries from the dump file don't match the index and cause index updates.
Transaction mode (--atomic) is used for all runs.
Cold cache, dry run: 138s real time, 53s user+system
Hot cache, dry run: 42s real time, 40s user+system
Hot cache, changes written to disk: 55s real time, 44s user+system
Hot cache, no updates: 43s real time, 41s user+system
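
The transaction handling used in these runs follows a begin / commit /
cancel-on-error pattern. A minimal sketch of that pattern is shown here,
using a hypothetical in-memory FakeIndex instead of the real index; only the
three method names begin_transaction, commit_transaction and
cancel_transaction are taken from this patch, everything else (FakeIndex,
update, with_transaction) is made up for illustration.

```ruby
# Hypothetical in-memory stand-in for the index. It keeps pending changes
# in a copy that only replaces the committed state on commit.
class FakeIndex
  attr_reader :committed

  def initialize
    @committed = {}
    @pending = nil
  end

  def begin_transaction
    @pending = @committed.dup
  end

  def commit_transaction
    @committed = @pending
    @pending = nil
  end

  def cancel_transaction
    @pending = nil
  end

  # Hypothetical update method: writes to the pending copy if a
  # transaction is open, otherwise directly to the committed state.
  def update(mid, labels)
    (@pending || @committed)[mid] = labels
  end
end

# Hypothetical helper: run the block inside a transaction, commit on
# success, cancel and re-raise on any error.
def with_transaction(index)
  index.begin_transaction
  yield index
  index.commit_transaction
rescue Exception
  index.cancel_transaction
  raise
end

index = FakeIndex.new
with_transaction(index) { |i| i.update("a@example.org", [:inbox]) }

begin
  with_transaction(index) do |i|
    i.update("b@example.org", [:spam])
    raise "simulated failure"
  end
rescue RuntimeError
  # The update for b@example.org was discarded together with the transaction.
end

puts index.committed.keys.inspect
```

The script itself spells out the same steps inline rather than through a
helper, since it has to treat AbortExecution and other exceptions
differently.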

Signed-off-by: Sascha Silbe <sascha-pgp@silbe.org>
---
 bin/sup-import-dump |   99 +++++++++++++++++++++++++++++++++++++++++++++++++++
 lib/sup/index.rb    |   15 ++++++++
 2 files changed, 114 insertions(+), 0 deletions(-)

diff --git a/bin/sup-import-dump b/bin/sup-import-dump
new file mode 100644
index 0000000..91a1721
--- /dev/null
+++ b/bin/sup-import-dump
@@ -0,0 +1,99 @@
+#!/usr/bin/env ruby
+
+require 'uri'
+require 'rubygems'
+require 'trollop'
+require "sup"; Redwood::check_library_version_against "git"
+
+PROGRESS_UPDATE_INTERVAL = 15 # seconds
+
+class AbortExecution < SystemExit
+end
+
+opts = Trollop::options do
+  version "sup-import-dump (sup #{Redwood::VERSION})"
+  banner <<EOS
+Imports message state previously exported by sup-dump into the index.
+sup-import-dump operates on the index only, so the messages must have already
+been added using sup-sync. If you need to recreate the index, see sup-sync
+--restore <filename> instead.
+
+Messages not mentioned in the dump file will not be modified.
+
+Usage:
+  sup-import-dump [options] <dump file>
+
+Options:
+EOS
+  opt :verbose, "Print message ids as they're processed."
+  opt :ignore_missing, "Silently skip over messages that are not in the index."
+  opt :warn_missing, "Warn about messages that are not in the index, but continue."
+  opt :abort_missing, "Abort on encountering messages that are not in the index. (default)"
+  opt :atomic, "Use transaction to apply all changes atomically."
+  opt :dry_run, "Don't actually modify the index. Probably only useful with --verbose.", :short => "-n"
+  opt :version, "Show version information", :short => :none
+
+  conflicts :ignore_missing, :warn_missing, :abort_missing
+end
+Trollop::die "No dump file given" if ARGV.empty?
+Trollop::die "Extra arguments given" if ARGV.length > 1
+dump_name = ARGV.shift
+missing_action = [:ignore_missing, :warn_missing, :abort_missing].find { |x| opts[x] } || :abort_missing
+
+Redwood::start
+index = Redwood::Index.init
+
+index.lock_interactively or exit
+begin
+  num_read = 0
+  num_changed = 0
+  index.load
+  index.begin_transaction if opts[:atomic]
+
+  IO.foreach dump_name do |l|
+    l =~ /^(\S+) \((.*?)\)$/ or raise "Can't read dump line: #{l.inspect}"
+    mid, labels = $1, $2
+    num_read += 1
+
+    unless index.contains_id? mid
+      if missing_action == :abort_missing
+        $stderr.puts "Message #{mid} not found in index, aborting."
+        raise AbortExecution, 10
+      elsif missing_action == :warn_missing
+        $stderr.puts "Message #{mid} not found in index, skipping."
+      end
+
+      next
+    end
+
+    m = index.build_message mid
+    new_labels = labels.to_set_of_symbols
+
+    if m.labels == new_labels
+      puts "#{mid} unchanged" if opts[:verbose]
+      next
+    end
+
+    puts "Changing flags for #{mid} from '#{m.labels.to_a * ' '}' to '#{new_labels.to_a * ' '}'" if opts[:verbose]
+    num_changed += 1
+
+    next if opts[:dry_run]
+
+    m.labels = new_labels
+    index.update_message_state m
+  end
+
+  index.commit_transaction if opts[:atomic]
+  puts "Updated #{num_changed} of #{num_read} messages."
+rescue AbortExecution
+  index.cancel_transaction if opts[:atomic]
+  raise
+rescue Exception => e
+  index.cancel_transaction if opts[:atomic]
+  File.open("sup-exception-log.txt", "w") { |f| f.puts e.backtrace }
+  raise
+ensure
+  index.save_index unless opts[:atomic]
+  Redwood::finish
+  index.unlock
+end
diff --git a/lib/sup/index.rb b/lib/sup/index.rb
index b90c2b1..bcc449b 100644
--- a/lib/sup/index.rb
+++ b/lib/sup/index.rb
@@ -260,6 +260,21 @@ EOS
     end
   end
 
+  ## wrap all future changes inside a transaction so they're done atomically
+  def begin_transaction
+    synchronize { @xapian.begin_transaction }
+  end
+
+  ## complete the transaction and write all previous changes to disk
+  def commit_transaction
+    synchronize { @xapian.commit_transaction }
+  end
+
+  ## abort the transaction and revert all changes made since begin_transaction
+  def cancel_transaction
+    synchronize { @xapian.cancel_transaction }
+  end
+
   ## xapian-compact takes too long, so this is a no-op
   ## until we think of something better
   def optimize
-- 
1.7.2.3