Indexed and searchable Document Types

WP-Filebase Pro indexes supported documents and enables searching their content. It scans the whole file content in a stable and efficient way, so any keyword on any page of a document is recognized.

Type                          Extension      Notes
Portable Document Format      .PDF           Requires Ghostscript with the txtwrite device (included in version 9.05 or higher)
Microsoft Word Document       .DOCX
Microsoft Excel Document      .XLSX
Microsoft PowerPoint Document .PPTX, .PPTM   Full text index of all slides
OpenDocument Text             .ODT
ZIP Archive                   .ZIP           Searches file names inside archive
Extensible Markup Language    .XML           Full text index of tag content
Comma-separated values        .CSV           Full text index
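
The PDF requirement in the table (Ghostscript 9.05 or higher, with the txtwrite device) can be checked from outside WordPress. The following is a minimal Python sketch, not part of WP-Filebase; the function names are hypothetical, and it assumes the Ghostscript binary is reachable as `gs`. It relies on `gs --version` printing the version number and `gs -h` listing the available devices.

```python
import re
import subprocess

MIN_VERSION = (9, 5)  # Ghostscript 9.05, the minimum named in the table


def parse_version(text):
    """Turn a version string such as '9.05' or '10.02.1' into a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def has_txtwrite(help_text):
    """Return True if 'txtwrite' appears as a device name in `gs -h` output."""
    return "txtwrite" in help_text.split()


def check_ghostscript(gs_path="gs"):
    """Run Ghostscript and report whether it meets both requirements.

    Raises FileNotFoundError if the binary at gs_path does not exist.
    """
    version = subprocess.run(
        [gs_path, "--version"], capture_output=True, text=True
    ).stdout
    help_text = subprocess.run(
        [gs_path, "-h"], capture_output=True, text=True
    ).stdout
    return parse_version(version) >= MIN_VERSION and has_txtwrite(help_text)
```

Calling `check_ghostscript("/usr/bin/gs")` with the same path that is saved in the plugin settings shows whether that particular binary qualifies. On Windows the console binary is typically named `gswin64c` rather than `gs`.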

The search index is updated on rescan. You can see the extracted keywords below the form when editing a file.
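
Conceptually, indexing a PDF amounts to converting it to plain text and then collecting the words found in that text. The Python sketch below illustrates the idea under stated assumptions; it is not the plugin's own code. The function names, the `gs` binary name, and the minimum word length of 3 are all hypothetical choices for the example. The Ghostscript options used (`-sDEVICE=txtwrite`, `-sOutputFile=-`) write the extracted text to standard output.

```python
import re
import subprocess


def pdf_to_text(pdf_path, gs_path="gs"):
    """Convert a PDF to plain text with Ghostscript's txtwrite device."""
    command = [
        gs_path, "-q", "-dNOPAUSE", "-dBATCH",
        "-sDEVICE=txtwrite", "-sOutputFile=-", pdf_path,
    ]
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout


def extract_keywords(text, min_length=3):
    """Return the unique lowercase words of `text`, in order of first appearance.

    Words shorter than min_length are skipped (an assumption of this sketch).
    """
    seen = {}
    for word in re.findall(r"\w+", text.lower()):
        if len(word) >= min_length and word not in seen:
            seen[word] = True
    return list(seen)
```

Running `extract_keywords(pdf_to_text("report.pdf"))` on a text PDF gives a quick view of what a text-based index could contain. If the list comes back empty, the PDF probably holds no extractable text, or the Ghostscript build lacks the txtwrite device.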

9 thoughts on “Indexed and searchable Document Types”

  1. JM says:

    Hi,
    thanks for the job done.
    Is WPFP also indexing the “content” of documents, including PDF files? (Or simply titles, tags, etc.)
    Is it relying on the traditional WP search engine, or does WPFP bring its own improvements to it?

    Best regards.

    1. Fabian says:

      Hi,
      yes, WP-Filebase scans the whole file. It’s like converting the PDF file to a plain text file and then extracting all the keywords from this text. It’ll find any keyword on any page.
      However, it does not show any details about the context or page number of the match occurrence, as Google results do (e.g. the whole sentence).

  2. Jeremy says:

    Is there a specific way that PDFs need to be processed in order to be searchable? I’ve noticed (with the Pro version) I can search for a text string and some PDFs will be returned in the search results but others will not despite having that text string. They’re all text PDFs (not images)…

    1. Fabian says:

      You need at least Ghostscript 9.05. I added this to the table above.

  3. Kjie says:

    Hi,

    I have the PRO version installed. When I search for any text which can be found in a .pdf file, it won’t show up. Am I doing something wrong?

    Kjie

  4. Steffen says:

    What does “The search index is updated on rescan” mean?
    Can I trigger a rescan?
    Ghostscript 9.06 is installed and the path to the executable is saved in the settings without an error message, but still neither thumbnails nor full-text indexing works.

    1. Steffen says:

      I found the rescan. Thumbnails work, but the full-text index of digital, unprotected PDFs still does not work.

  5. Steffen says:

    What is wrong when the WP-Filebase Pro Search Widget finds only PDFs with the search term in the filename, but not in the PDF content?
