Sunday, May 4, 2008

Verity K2 and indexing files

For whatever reason, indexing a fairly large number of Excel files with Verity K2 (CFMX 7.2) seems to take forever. The CPU is super-busy, and the collection size has barely changed even after waiting for more than an hour. On the other hand, indexing smaller directories is a lot faster: 200 small Excel files are done in about 2 minutes or so (the collection is still well below 100,000 documents).

Considering how fast database records were indexed (about 2-3 minutes per 10,000 database rows), it just seemed like something was not right. Anyway, chunking those 20,000+ Excel files into sub-directories brought the total indexing time below 30 minutes. Tip of the day: don't try to index such a large number of Excel sheets in one go; index smaller directories instead.
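
For what it's worth, the chunking itself is easy to script. Here is a minimal sketch in Python, not the exact approach I used: the function name chunk_into_subdirs, the chunk_NNNN directory naming, the *.xls pattern and the chunk size of 200 are all assumptions for illustration (200 is simply the directory size that indexed quickly above).

```python
import shutil
from pathlib import Path


def chunk_into_subdirs(source_dir, chunk_size=200, pattern="*.xls"):
    """Move matching files into numbered sub-directories of at most chunk_size files.

    Hypothetical helper: names and defaults are illustrative assumptions.
    Returns the list of sub-directories that were filled.
    """
    source = Path(source_dir)
    # Only top-level files are picked up; sorting keeps the split repeatable.
    files = sorted(p for p in source.glob(pattern) if p.is_file())

    created = []
    for start in range(0, len(files), chunk_size):
        target = source / f"chunk_{start // chunk_size:04d}"
        target.mkdir(exist_ok=True)
        for path in files[start:start + chunk_size]:
            shutil.move(str(path), str(target / path.name))
        created.append(target)
    return created
```

Each resulting sub-directory can then be indexed on its own, one after the other. Try it on a copy of the files first, since the script moves them rather than copying.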
