What Is Enterprise Search
Enterprise search is the application of parsing, indexing and querying technologies to build a "search engine" which allows retrieval of information from within a given enterprise. Enterprise Search engines typically index content from internal Document Management Systems, file shares, blog servers, wikis, email, Intranets and other content repositories. More powerful Enterprise Search systems incorporate both structured and unstructured data, including semantic markup, and may include social-search, semantic-search, faceted search, visual navigation and other advanced features. An Enterprise Search system typically consists of several components, including:
- Content ingestion / crawling
- Content processing and analysis
- Indexing
- Query parsing
- Matching
Content ingestion / crawling - This is the part of the system which deals with finding content and pulling it into the system. These components are often visualized as "crawlers" that crawl the web of links between discrete pieces of content. Crawlers retrieve documents and other content over protocols like HTTP, and/or use adapters to connect to specialized repositories like relational databases or document management systems.
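As a rough illustration of the crawling idea, the sketch below follows links breadth-first from a seed page. The `SITE` dictionary, the `crawl` function and the `LinkExtractor` class are hypothetical names invented for this example; the dictionary stands in for real HTTP requests or a repository adapter, so this is a simplified sketch rather than a production crawler.

```python
from collections import deque
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href targets of <a> tags in one HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed, fetch):
    """Breadth-first crawl from `seed`; `fetch(url)` returns HTML or None."""
    seen = {seed}
    queue = deque([seed])
    pages = {}
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # unreachable content is skipped, not indexed
        pages[url] = html
        extractor = LinkExtractor()
        extractor.feed(html)
        for link in extractor.links:
            if link not in seen:  # visit each piece of content once
                seen.add(link)
                queue.append(link)
    return pages


# Hypothetical in-memory "intranet" standing in for HTTP or an adapter.
SITE = {
    "/home": '<a href="/wiki">Wiki</a> <a href="/blog">Blog</a>',
    "/wiki": '<a href="/home">Home</a> <a href="/missing">Gone</a>',
    "/blog": "<p>No links here</p>",
}

pages = crawl("/home", SITE.get)
print(sorted(pages))
```

In a real deployment the `fetch` argument would be replaced by an HTTP client or a connector for a database or document management system; the traversal logic stays the same.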
Content processing and analysis - This part of the system parses the content, separates text into tokens, identifies sentences and performs "part of speech" tagging. It may also use machine learning algorithms and/or NLP to perform semantic "concept extraction" and Named Entity Recognition.
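The first two steps, sentence identification and tokenization, can be approximated with regular expressions. The function names `split_sentences` and `tokenize` are assumptions made for this example, and the rules are deliberately naive (abbreviations such as "e.g." would be split incorrectly). Part-of-speech tagging and Named Entity Recognition normally require a dedicated NLP library and are not shown.

```python
import re


def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(text):
    """Lowercase the text and return runs of letters and digits as tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


document = "Search is hard. Indexing helps!"
for sentence in split_sentences(document):
    print(sentence, "->", tokenize(sentence))
```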
Indexing - Once the content has been processed, an index is built which allows for rapid retrieval of exactly those items which match a particular query. For text searching this most often takes the form of an inverted text index.
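A minimal sketch of an inverted text index follows: each term maps to the set of documents that contain it, so a lookup does not need to scan every document. The `build_index` function, the sample documents and the simple regular-expression tokenizer are all assumptions made for illustration; real engines also store positions, frequencies and other statistics.

```python
import re
from collections import defaultdict


def build_index(docs):
    """Map each term to the set of document ids whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term].add(doc_id)
    return index


# Hypothetical sample content keyed by document id.
docs = {
    1: "Enterprise search indexes wikis",
    2: "Search engines use an inverted index",
    3: "Wikis and blogs are content repositories",
}

index = build_index(docs)
print(sorted(index["search"]))
print(sorted(index["wikis"]))
```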
Query parsing - Enterprise Search applications may allow very general "free form" keyword searching, or may support specialized query syntax to allow more specific queries. At the most advanced end, an Enterprise Search engine may support a standardized query language like SQL or SPARQL, which allows for highly targeted queries against structured data. In every case, the query parser converts the query into a representation which can be used, along with the index, to determine matching results.
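To make the idea of a query representation concrete, the sketch below parses a free-form query that may also contain `field:value` filters. The `field:value` syntax, the `ParsedQuery` class and the `parse_query` function are assumptions chosen for this example, not the syntax of any particular product.

```python
from dataclasses import dataclass, field


@dataclass
class ParsedQuery:
    """Structured form of a query: plain keywords plus field filters."""
    terms: list = field(default_factory=list)
    filters: dict = field(default_factory=dict)


def parse_query(raw):
    """Split a query string into keywords and `name:value` filters."""
    query = ParsedQuery()
    for part in raw.split():
        name, sep, value = part.partition(":")
        if sep and name and value:
            query.filters[name.lower()] = value.lower()
        else:
            # No usable "name:value" pair, so treat it as a keyword.
            query.terms.append(part.lower())
    return query


parsed = parse_query("quarterly report author:Alice")
print(parsed.terms)
print(parsed.filters)
```

A parser for a full query language such as SQL or SPARQL would produce a much richer structure, but the principle of turning text into something the matcher can evaluate is the same.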
Matching - The matcher is the heart of the Enterprise Search application, as it determines exactly which pieces of content match the requested query, and returns a representation of that content.
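One simple way to implement matching is to intersect the posting sets of every query term, so that only documents containing all of the terms are returned. The `match` function and the hand-built `index` below are hypothetical, and the "all terms must match" rule is an assumption; production engines typically also score and rank the results by relevance.

```python
def match(index, terms):
    """Return sorted ids of documents that contain every query term."""
    postings = [index.get(term, set()) for term in terms]
    if not postings:
        return []  # an empty query matches nothing
    return sorted(set.intersection(*postings))


# Hypothetical inverted index: term -> ids of documents containing it.
index = {
    "enterprise": {1, 2},
    "search": {1, 2, 3},
    "wiki": {3},
}

print(match(index, ["enterprise", "search"]))
print(match(index, ["wiki", "enterprise"]))
```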
Enterprise Search applications may also include the ability to parse, index and retrieve non-textual data, such as audio and video files, and images.
Why Should I Care About Enterprise Search
In interviews with customers, we have consistently found that most enterprises have either no search capability, or an inadequate search capability. Knowledge workers waste a tremendous amount of time looking for information, sorting through poor search results, or jumping from system to system to run multiple searches. In the worst case, the correct information is never located at all, which leads to poor decision making or missed opportunities.
How Can Enterprise Search Benefit My Organization
A high-quality Enterprise Search engine will return highly relevant results featuring content from all of an organization's varied repositories. This makes it faster and easier for knowledge workers to locate the content they need throughout their workday. More advanced search applications will also feature recommendation engines, which help users find additional content similar to a given piece of content, and other tools such as visual navigation and faceted search, which support information discovery. High-quality search and information discovery are fundamental capabilities for enabling knowledge transfer within an organization, and for allowing an organization to become a "learning organization".
How Do I Implement Enterprise Search
There are many Enterprise Search products available, including Open Source solutions such as Fogbeam Labs' Heceta, Apache Solr and ht://dig, as well as proprietary offerings from HP, IBM and others.
Open Source solutions can offer greater value and, because the source can be adapted to local needs, maximize the flexibility of the resulting system. Fogbeam Labs' Heceta is an advanced Open Source Enterprise Search offering, based on Apache Solr and Lucene, which incorporates both social-search and semantic-search to provide the powerful search capabilities that today's fast-paced, information-intensive enterprises demand.
For more information on all of our enterprise knowledge management, collaboration, and search products, please contact us today.
Content on this page is based, in part, on the corresponding Wikipedia entry.