MoonPoint Support

 

Wed, May 16, 2007 9:57 pm

htDig Invalid Comptype

I ran ht://Dig to index the site today using the command /usr/bin/rundig -c /etc/htdig_support.conf >>/var/log/htdig 2>&1. The indexing process took a considerable amount of time, and when I performed htdig searches of the site after it completed, none of the searches returned any results. When I checked the output file for the rundig command, /var/log/htdig, I saw the errors below:

# cat /var/log/htdig
FATAL ERROR:Compressor::get_vals invalid comptype
FATAL ERROR at file:WordBitCompress.cc line:827 !!!
/usr/bin/rundig: line 36: 23767 Segmentation fault      $BINDIR/htdig -i $opts $stats $alt
/usr/bin/rundig: line 81: 24766 Segmentation fault      /usr/bin/htfuzzy $opts metaphone
/usr/bin/rundig: line 82: 24767 Segmentation fault      /usr/bin/htfuzzy $opts soundex
I found some references to others encountering the same error message when I performed a Google search, but didn't see anything that I felt would give me an appropriate fix for my system. Some of the references seemed to indicate the problem occurred when htdig was indexing an enormous number of files. But there are only a few hundred files for it to index on my site, so I didn't think the number of files should be the cause of the problem. However, htdig had been indexing pages in my Blosxom blog several times, because of my use of the Find plugin for Blosxom.
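Since the failure only showed up in the log file, it can help to check the log automatically after each run. The following is a minimal sketch, not part of ht://Dig: the function name rundig_log_has_errors is my own, and the two patterns are simply the strings that appeared in the log above, so other failure messages would need their own patterns.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with ht://Dig).
# Succeeds (exit status 0) if the given log file contains a fatal error
# or a segmentation fault message like the ones shown above.
rundig_log_has_errors() {
    grep -q -e 'FATAL ERROR' -e 'Segmentation fault' "$1"
}

# Example use after an indexing run:
#   rundig_log_has_errors /var/log/htdig && echo "htdig indexing failed"
```

Note that if the log is written with >>, as in the command above, errors from earlier runs remain in the file and will also match.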

I included a search feature on each page of the blog that uses Fletcher Penney's find plugin to allow a search of the blog for information. Underneath the search box there is an "Advanced Search" link that provides more advanced search capabilities. Clicking on it will display the same blog page as was visible before, but with advanced search options visible. This was resulting in ht://Dig returning the same page multiple times whenever I used it to search the entire site (the Find plugin only searches the blog while I have htdig search the entire site).

I thought I might reduce the extraneous results for htdig queries, reduce the time to index the site when running rundig, and possibly eliminate the "FATAL ERROR:Compressor::get_vals invalid comptype" error message by having htdig exclude the "Advanced Search" links when indexing the site. Since that link always includes "advanced_search=1" in its URL, I edited the htdig configuration file for the website, which is /etc/htdig_support.conf in this case, and added "advanced_search=1" to the exclude_urls list. So I now have the following line in that conf file (the "/cgi-bin/ .cgi" was there by default):

exclude_urls:           /cgi-bin/ .cgi advanced_search=1
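
To preview which URLs a list like this would drop, the matching can be imitated outside of htdig. This is only a sketch under the assumption that exclude_urls rejects any URL containing one of the listed strings; the function name filter_excluded_urls and the example.com URLs are made up for illustration, and the function does not call htdig itself.

```shell
#!/bin/sh
# Hypothetical helper: reads URLs on standard input, one per line, and
# prints only those that contain none of the substrings given as arguments.
# It imitates substring-based exclusion; it is not htdig's own code.
filter_excluded_urls() {
    # One literal pattern per line; grep -F treats them as fixed strings.
    patterns=$(printf '%s\n' "$@")
    grep -v -F -e "$patterns"
}

# Example use with made-up URLs:
#   printf '%s\n' \
#       'http://example.com/blog/index.html' \
#       'http://example.com/blog/index.html?advanced_search=1' |
#       filter_excluded_urls /cgi-bin/ .cgi advanced_search=1
```

With the patterns from the line above, the second example URL would be filtered out because it contains "advanced_search=1", while the first would be kept.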

I also added some file extensions to the list of filetypes htdig should exclude from its indexing process. I added ".mp3 .img .iso .dat .dll .scr" to the bad_extensions section, so I now have the following in that list:


bad_extensions:         .wav .gz .z .sit .au .zip .tar .hqx .exe .com .gif \
        .jpg .jpeg .aiff .class .map .ram .tgz .bin .rpm .mpg .mov .avi .css \
        .cab .png .rar .mp3 .img .iso .dat .dll .scr

There is no need for htdig to index binary files. If they aren't excluded, indexing the site will take longer and the chances that htdig will fail while indexing will increase greatly. If you use htdig and store other types of music or movie files on a site, you should add their extensions to the bad_extensions list.
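
One way to find candidates for the bad_extensions list is to survey which file extensions actually exist under the site's document root. The sketch below is an illustration only: the function name list_extensions is my own, and the directory you pass to it depends on your server's layout.

```shell
#!/bin/sh
# Hypothetical helper: prints each file extension found under the given
# directory followed by the number of files that have it, most common first.
list_extensions() {
    find "$1" -type f -name '*.*' |   # only files with a dot in the name
        sed -e 's/.*\.//' |           # keep the text after the last dot
        sort | uniq -c | sort -rn |   # count each extension, largest first
        awk '{ print $2, $1 }'        # print "extension count"
}

# Example use (the path is an assumption; use your own document root):
#   list_extensions /var/www/html
```

Any binary types that show up in the output and are missing from bad_extensions are worth considering for the list.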

When I reran rundig with the command /usr/bin/rundig -c /etc/htdig_support.conf >/var/log/htdig 2>&1, it did not fail this time, and when I performed htdig searches of the site, the results no longer included duplicates caused by the Blosxom Find plugin's "Advanced Search" links.

References:

  1. RE: [htdig] Segfault indexing a site with 3.2.0b2
    May 23, 2000
    ht://Dig 3.x list archive

  2. Error in zlib Compressor for WordDB
    July 30, 2002
    web.htdig.devel

  3. FindPlugin
    Author: Fletcher T. Penney


Mon, Jul 25, 2005 8:50 pm

ht://Dig Setup

I installed ht://Dig because I thought I had placed certain information on my website, which I wanted to refer to again, but I couldn't locate it. I have a search tool for the blog, but that will only search the blog's content. Since I couldn't find the information with that tool, I thought I might have placed the information in a file or files that weren't part of the blog's entries. So I installed htdig and used it to search the entire site. I still couldn't find the information, though I can recall creating a webpage with the information.

Oh well, I'll just have to keep looking or figure out how to do what I need to do again. One of the reasons I created the blog was to serve as a reference when my memory fails me on how I resolved a problem in the past. But, if I didn't post the information here, it's going to take me much longer to locate it or figure out again the steps I took previously.

So that I won't have that problem with installing htdig again, I've posted my notes in the blog. Hopefully, they will also help someone else resolve problems or answer questions about setting it up so it can be used with multiple websites on the same server.
