Here's an illustration from CMS Watch that we use in the new AIIM Information Organization & Access (IOA) Certificate program that shows the different search subsystems and how they work together.
First is content indexing. The index is created by crawling directories and websites and extracting content from databases and other repositories. This has to be done on a regular basis, so that when one of those repositories is updated, the search engine has some sort of procedure that lets it go back in, retrieve the updated content, and index it.
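To make the crawl-and-reindex idea concrete, here is a minimal sketch in Python. It assumes the repositories can be represented as an in-memory dictionary of document id to text; the `SearchIndex` class, its `crawl` method, and the sample documents are hypothetical names for illustration, not part of any particular search product. Changes are detected by comparing a content hash against the one stored at the previous crawl.

```python
import hashlib
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class SearchIndex:
    """Toy inverted index: term -> set of document ids (illustrative only)."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.fingerprints = {}  # doc_id -> hash of the content last indexed

    def crawl(self, repository):
        """Index new or changed documents; return the ids that were (re)indexed."""
        updated = []
        for doc_id, text in repository.items():
            fingerprint = hashlib.sha256(text.encode("utf-8")).hexdigest()
            if self.fingerprints.get(doc_id) == fingerprint:
                continue  # unchanged since the last crawl, so skip it
            self._remove(doc_id)
            for term in tokenize(text):
                self.postings[term].add(doc_id)
            self.fingerprints[doc_id] = fingerprint
            updated.append(doc_id)
        return updated

    def _remove(self, doc_id):
        """Drop stale postings for a document before re-indexing it."""
        for doc_ids in self.postings.values():
            doc_ids.discard(doc_id)


# Hypothetical repository contents.
repository = {
    "policy.txt": "Records retention policy for contracts",
    "faq.html": "How to search the intranet",
}
index = SearchIndex()
first_pass = index.crawl(repository)   # both documents are new

repository["faq.html"] = "How to search the records archive"
second_pass = index.crawl(repository)  # only the changed document is re-indexed
```

This sketch does not handle documents that are deleted from a repository; a fuller version would also remove ids that no longer appear in the crawl.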
So once it gathers all that content, as I mentioned, it creates a searchable index of all the content. Oftentimes there's other value-added processing as well, such as metadata extraction and auto-categorization. What exactly does that mean?
Well, many search tools will actually take the collection and group documents together into some sort of category. Those categories can in turn be searched, so a user could get results based on how the particular search engine has categorized the content. Once this index exists, the system can accept queries. A searcher types in some sort of query describing what they're looking for. A query is not necessarily in question form; it's just a term, or whatever you're looking for, typed into the search box.
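The grouping of documents into categories described above could look something like the following sketch. It assumes a very simple keyword-rule approach; real tools often use statistical or machine-learning classifiers instead. The `CATEGORY_KEYWORDS` rules, the `categorize` function, and the sample documents are all made up for illustration.

```python
import re

# Hypothetical category rules: category name -> keywords that signal it.
CATEGORY_KEYWORDS = {
    "legal": {"contract", "policy", "compliance"},
    "hr": {"vacation", "benefits", "payroll"},
}


def categorize(text):
    """Return the sorted categories whose keywords appear in the text."""
    terms = set(re.findall(r"[a-z0-9]+", text.lower()))
    return sorted(
        name for name, keywords in CATEGORY_KEYWORDS.items() if terms & keywords
    )


# Hypothetical documents.
documents = {
    "doc1": "Vacation policy for new employees",
    "doc2": "Supplier contract template",
    "doc3": "Payroll calendar",
}

# Build category -> document ids so results can later be filtered or grouped.
by_category = {}
for doc_id, text in documents.items():
    for category in categorize(text):
        by_category.setdefault(category, []).append(doc_id)
```

A document can land in more than one category here ("doc1" matches both rules), which is usually what you want when categories are used as search filters.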
And then there's an engine that processes this query. The query passes over the index, finds the documents that match that particular term or subject, and returns those documents to some sort of processor. The processor will sort the documents by relevance, cluster them based on the categorization, or apply some other logic, such as promoting best bets or recommended content if you have them. It's really up to you how you want to process the results once the query returns them.
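As a rough sketch of the query and processing steps just described, the Python below matches a query against a small collection, ranks the matches by a simple term-frequency score, and puts any editor-chosen best bets first. For brevity it scans the documents directly rather than using a prebuilt index, and the scoring is deliberately naive. The `search` function, the `best_bets` mapping, and the sample data are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search(query, documents, best_bets=None):
    """Return matching document ids: best bets first, then by descending score."""
    query_terms = tokenize(query)
    scores = {}
    for doc_id, text in documents.items():
        counts = Counter(tokenize(text))
        score = sum(counts[term] for term in query_terms)
        if score > 0:
            scores[doc_id] = score
    # Sort by score (highest first); break ties by document id for stability.
    ranked = sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
    # Best bets are looked up by the normalized query string (an assumption).
    wanted = (best_bets or {}).get(query.lower().strip(), [])
    promoted = [doc_id for doc_id in wanted if doc_id in documents]
    return promoted + [doc_id for doc_id in ranked if doc_id not in promoted]


# Hypothetical collection and best-bet configuration.
documents = {
    "a": "Expense report template and expense policy",
    "b": "Travel expense guidelines",
    "c": "Holiday calendar",
    "d": "Finance portal home",
}
best_bets = {"expense": ["d"]}

results = search("expense", documents, best_bets)
```

Here the finance portal is returned first for "expense" even though it never mentions the word, because it was configured as a best bet; the remaining results follow in relevance order.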
Then, of course, lastly we have the formatting. That's the results page that you're used to seeing. It formats the results, usually in some sort of template, and there you also have a lot of flexibility as to how you'd like to see them presented. Now, at every single step along this process, all of these subsystems can be tweaked to accommodate your particular information organization and access needs. The part at the top, around content indexing, is where you're going to be particularly occupied with your information organization. How your content is organized is going to affect how well the search tool can go through the collection and create that index.
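For the formatting step mentioned above, a template-driven results page might be sketched like this, using Python's standard `string.Template`. The template layout, the `format_results` function, the result fields (`title`, `url`, `text`), and the example URL are assumptions for illustration; a real results page would more likely be rendered as HTML.

```python
from string import Template

# Hypothetical plain-text template for a single search result.
RESULT_TEMPLATE = Template("$rank. $title\n   $url\n   $snippet")


def format_results(results, snippet_length=60):
    """Render a list of result dicts as a numbered, plain-text results page."""
    lines = []
    for rank, result in enumerate(results, start=1):
        snippet = result["text"][:snippet_length]
        if len(result["text"]) > snippet_length:
            snippet += "..."  # mark that the snippet was truncated
        lines.append(
            RESULT_TEMPLATE.substitute(
                rank=rank,
                title=result["title"],
                url=result["url"],
                snippet=snippet,
            )
        )
    return "\n".join(lines)


# Hypothetical result returned by the query processor.
sample_results = [
    {
        "title": "Expense policy",
        "url": "https://intranet.example/expense",
        "text": "Short text",
    },
]
page = format_results(sample_results)
```

Because the layout lives in the template rather than in the code, the presentation can be changed without touching the indexing or query logic.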
The second part is customizing your access experience, if you wish to do so. The search tool will allow you to specify what kinds of queries you want to accept, what kinds of documents you want to return based on those queries, and then you have lots of options as to how you want them processed and how you want them presented.
There is a growing recognition in the industry that what matters is not how searchable you make your information, but how findable. The emphasis here lies less on the latest algorithm, and more on the success of your information management regimen and your capacity to incorporate an effective user experience into the search process.
Search is not just search anymore, and the analyst company Gartner has in recent years been using the term "information access technology" to include and expand on what they previously called "enterprise search technology". They use the term information access to cover a collection of technologies that help you find information, such as:
- enterprise search;
- content classification, categorization and clustering;
- fact and entity extraction;
- taxonomy creation and management;
- information presentation (for example visualization).
This is a useful expansion of the problem set, but we should keep in mind that many of the tools around extraction, classification, and categorization remain supplementary to the essential professional task of organizing information.
AIIM has therefore introduced an Information Organization and Access (IOA) Certificate program that covers global best practices for organizing information for improved access. Courses are available online or in the classroom; please visit www.aiim.org/training for course objectives and agenda.
The information provided on this page is courtesy of AIIM®.