Searchable Attachments

mrvogler1 (Newbie, joined 03 June 2008)
Posted: 03 June 2008 at 1:28pm

Does your software support the ability to search uploaded attachments? I want to upload many files (.pdf, .xls, .doc) so they can become a knowledge base for searches as well as the posts.

WebWiz-Bruce (Admin Group, Web Wiz Developer, joined 03 September 2001, Bournemouth)
Posted: 03 June 2008 at 1:43pm
It would be very difficult to search such files, as the data within them is not saved as plain text but as binary.

I've not seen anything on the web that would allow such documents to be searched. You may find a component to be able to search these types of documents but I doubt it would be cheap.

mrvogler1 (Newbie)
Posted: 03 June 2008 at 2:24pm

The file types I have mentioned are text-based files. Indexing tools provided by Windows and SQL Server do make document files indexable and thereby searchable. I'm not sure how your software indexes the posted text, but it would seem you might try to support this in the future.

WebWiz-Bruce (Admin Group, Web Wiz Developer)
Posted: 03 June 2008 at 6:21pm
Sorry, but I beg to differ. Try opening a PDF, DOC, or XLS file in a text editor; it would just be a jumble with no plain text. Below is what you would see if you opened a Word doc in a text editor. As you will see, it's not plain text:

»Ó$˜ýÃîTÛ·w-ˆj҃ǝùæ›ß|ìzs°ƒzç˜zï*X%(vÚ›Þµ¼6O‹{PIȼã
Žœ`S__­_x ÉC©ëCRÙÅ¥
:‘ð€˜tÇ–Rá»ÜÙùhIò3¶H¿Q˸*Ë;Œ¿= yª­© nÍ
¨æòæyo¿Ûõš½Þ[vrfòAØ6‹3[”>_£Š-KÆëç\NH!ð<Ñêr¢¿¯EËB†„PûÈÓ<_Š) åå@ó?é|øh0GtÊvŠæö?iô>‰·3ñœ4ßH8ú˜õ'   ÿÿ


Attachments/uploads are not stored in the database, so you could not use SQL Server to search them.

Windows may be able to search Word documents etc., but web applications don't work like desktop applications; they don't have access to many of the things that you see running on your desktop.

Yes, it would be nice to be able to search Word documents and PDFs. Trust me, this has been looked into before and a great deal of time has been spent on it, but it is simply not possible with just ASP.

As I mentioned earlier, you may find a component that allows ASP to do this, but we have not come across one yet.

If you do find a component that allows ASP to search PDFs, Word docs etc., let me know; I would love to find one. You may then see support for it in future releases, but at the current time I have not seen any.


Edited by WebWiz-Bruce - 03 June 2008 at 6:34pm

mrvogler1 (Newbie)
Posted: 03 June 2008 at 8:18pm
I am a .NET programmer and am familiar with some of the indexing techniques.
 
I agree that .pdf, .doc and .xls files contain binary data and thus are not readable by just opening the file in a text editor. When I said they were textual, I meant that, if properly parsed, the text content can be indexed and thereby made searchable, unlike an image file, which may have text as part of the image but isn't really indexable because it is just pixels.
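
To illustrate the point about parsing, here is a rough sketch only, not a real .doc or .pdf parser. The hypothetical Python helper `extract_text_runs` pulls runs of printable ASCII characters out of arbitrary bytes, much like the Unix `strings` tool. It assumes that at least some of the text is stored uncompressed in the file, which is not true of every PDF; real indexers rely on format-specific filters instead.

```python
import re


def extract_text_runs(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of at least `min_len` printable ASCII characters.

    This is a crude stand-in for a real document parser: it ignores
    compression, UTF-16 text and any file structure.
    """
    # \x20-\x7e covers space through tilde, i.e. printable ASCII.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]


# Made-up sample bytes: two readable phrases surrounded by binary noise.
sample = b"\x00\xd3\x98knowledge base\xff\x01ab\x00search\x9c"
runs = extract_text_runs(sample)
```

The short run `ab` is dropped because it is below the minimum length, while the two longer phrases survive and could be handed to an indexer.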
 
Possible solutions for you to consider.
 
1. If the attachments were stored in the SQL Server database in a binary column, the table could be set to "full-text" indexing within SQL Server. A document type (.pdf, .doc, .xls) is also specified so SQL Server knows how to parse it for indexing. This is available in both SQL Server 2000 and SQL Server 2005.
 
2. Microsoft Indexing Service - can be leveraged to index files stored in the file system. Very easy to use. Just drop files in a directory and the indexing service takes it from there.
 
 
Both solutions above support making SQL-like queries against the index sources to find results.
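
As a rough illustration of what such an index does once the text has been extracted, here is a minimal inverted-index sketch in Python. It is not how SQL Server full-text search or the Indexing Service is implemented; the function names `build_index` and `search` and the sample file names are made up for the example.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Map each word to the set of document names that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for name, text in docs.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the documents that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    # Copy the first posting set so the index itself is never modified.
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical extracted text, keyed by attachment file name.
docs = {
    "install.pdf": "Install guide for the forum",
    "backup.doc": "Forum backup guide",
}
index = build_index(docs)
```

Calling `search(index, "forum guide")` returns both file names, while `search(index, "backup")` returns only `backup.doc`.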
 
You and/or your team may have decided not to use these tools/techniques to build the indexes on attachments, but there certainly are solutions out there.

WebWiz-Bruce (Admin Group, Web Wiz Developer)
Posted: 04 June 2008 at 9:34am
I would rather not store attachments in the database, as this generally bloats the database. The present system also allows you to manage your uploaded files and images on the server, which would be trickier if attachments were stored in the database.

The problem with using the Indexing Service is that it is very rarely enabled on web servers, especially shared web servers, where the software is most in use. I've also not seen a way to use the Microsoft Indexing Service with classic ASP, but if you do know how, please let me know.

wb-in-wpb (Newbie, joined 16 April 2008)
Posted: 04 June 2008 at 5:35pm
Is there something that could be done with a Google API? I know Google returns searches based on text found inside pdf files and other binary files.
 
Maybe it is not something that could (or should) be integrated into the forum software, but it could solve the original poster's problem. It may be that he will need two search links: one for posts and one for files/attachments.
 
Here is the direct link to their custom search engine -- http://www.google.com/coop/cse/
The API and open source code are here -- http://code.google.com/
 