
Embedded Search in the Enterprise

We need to make a distinction between “search in the enterprise” and “enterprise-wide search.” The former is any search that exists persistently in view as we go about our primary work activities. The latter commonly assumes either aggregation of all enterprise content via a single platform or a body of enterprise content to which everyone in the organization will have access. So many attempts at enterprise-wide search are reported to be compromised or frustrated before achieving successful outcomes that it is time to pay attention to point-of-need solutions: search that will smoothly satisfy routine retrieval requirements as we work.

Most of us work in a small number of applications all day. A writer will be wedded to a content creation application plus research sources both on the web and internal to the enterprise in which writing is being done. Finding information to support writing, whether it is a press release, marketing brochure or technical documentation to accompany a technical product, requires access to appropriate content for the writer to deliver to an audience. The audience may be a business analyst, customer’s buyer or product user with advanced technical expertise. During any one work assignment, the writer will usually be focused on one audience and will only need a limited view of content specific to that task.

When a search takes us on a merry chase through multiple resource repositories or in a single repository with heaps of irrelevant content and no good results, we are being forced into a mental traffic nightmare, not of our own making. As this blog post by Tony Schwartz reminds us, we need time to focus and concentrate. It enables us to work smarter and more calmly; for employers seeking to support workers with the best tools, search that works well at the point of doing an assignment is the ultimate perk. I know how frantic and fractionated my mental state becomes as I follow one fruitless web of links after another that I believe will lead me to the piece of information I need. Truthfully, I often become so absorbed in the search and ancillary information I “discover” along the way that sight of the target becomes secondary.

New wisdom from a host of analysts and writers suggests that embedded search is more than a trend, as is search with a specific focus or purposeful business goal. The fact that FAST is now embedded with and for SharePoint, and that its use is growing principally in that arena, illustrates the trend. But readers should also consider a large array of newer search solutions that are strong on semantic features, APIs, integration options, and connectors to a huge variety of content that exists in other application repositories. This article by James Martin in CIO, How to Evaluate Enterprise Search, has helpful comments from Leslie Owens of Forrester Research, and the rise of connectors is highlighted by Alan Pelz-Sharpe in this post.

Right now two rather new search engines are on my radar screen because of their timely entrance to the marketplace. One is Q-Sensei, which has just released its version 2.0. It is an ontology-based solution very much focused on efficiently processing big data, quick deployment, and integration with content applications. The second is Cambridge Semantics with its Anzo semantic solutions for analyzing and retrieving business data. Finally, I am very excited that ISYS was the object of an acquisition by Lexmark. It was an unexpected move, but ISYS deserved to be recognized for having solid connector/filter technology and a large, satisfied customer base. It will be interesting to see how a hardware vendor, noted for print technology, will integrate ISYS search software into its product offerings. Information retrieval belongs where work is being done.

These are just three vendors poised to change the expectations of searchers by fulfilling search needs, embedded or integrated efficiently in select business application areas. Martin White’s most recent enumeration of search vendors puts the list at about 70; they are primarily vendors with standalone search products, products that support standalone search or search engines that complement other content applications. You will see many viable options there that are unfamiliar but be sure to dig down to understand where each might fill a unique need in your enterprise.

When seeking solutions for search problems you need to really understand the purpose before seeking candidate vendors. Then focus on products that have the same clarity of applicability you want. They may be embedded with a product such as Lexmark’s, or a CAD system. The first step is to decide where and for whom you need search to be present.

Why is it so Hard to “Get” Semantics Inside the Enterprise?

Semantic Software Technologies: Landscape of High Value Applications for the Enterprise was published just over a year ago. Since then the marketplace has been increasingly active; new products emerge and discussion about what semantics might mean for the enterprise is constant. One thing that continues to strike me is the difficulty of explaining the meaning of, applications for, and context of semantic technologies.

As I browsed through the topics on this excellent blog site, http://semanticweb.com, the proverbial case of the blind men describing an elephant came to mind. A blog, any blog, is linear. While there are tools to give a blog dimension by clustering topics or presenting related information, it is difficult to understand the full relationships of any one blog post to another. Without a photographic memory, an individual does not easily connect ideas across a multi-year domain of blog entries. Semantic technologies can facilitate that process.

Those who embrace some concept of semantics are believers that search will benefit from “semantic technologies.” What is less clear is how evangelists, developers, searchers and the average technology user can coalesce around the applications that will semantically enable enterprise search.

On the Internet content that successfully drives interest, sales, opinion and individual promotion does so through a combination of expert crafting of metadata, search engine technology that “understands” the language of the inquirer and the content that can satisfy the inquiry. Good answers are reached when questions are understood first and then the right content is selected to meet expectations.

In the enterprise, the same care must be given to metadata, search engine “meaning” analysis tools and query interpretation for successful outcomes. Magic does not happen without people behind the scenes who meet these three criteria by executing linguistic curation, content enhancement and computational linguistic programming.

Three recent meeting events illustrate various states of semantic development and adoption, even as the next conference, Semantic Tech & Business Conference – Washington, D.C. on November 29 – is upon us:

Event 1 – A relatively new group, the IKS-Community funded by the EU has been supporting open source software developers since 2009. In July they held a workshop in Paris just past the mid-point of their life cycle. Attendees were primarily entrepreneurs and independent open source developers seeking pathways for their semantically “tuned” content management solutions. I was asked to suggest where opportunities and needs exist in US markets. They were an enthusiastic audience and are poised to meet the tough market realities of packaging highly sophisticated software for audiences that will rarely understand how complex the stuff “under the hood” really is. My principal charge to them was to create tools that “make it really easy” to work with vocabulary management and content metadata capture, updates, and enhancements.

Event 2 – On this side of the pond, UK firm Linguamatics hosted its user group meeting in Boston in October. Having interviewed a number of their customers last year to better understand their I2E product line, I was happy to meet people I had spoken with and see the enthusiasm of a user community vested in such complex technology. Most impressive is the respectful tone and thoughtful sharing between Linguamatics principals and their customers. They share the knowledge of how hard it is to continually improve search technology that delivers answers to semantically complex questions using highly specialized language. Content contributors and inquirers are all highly educated specialists seeking answers to questions that have never been asked before. Think about it: designing search engines to deliver results for frequently asked questions or to find content on popular topics is hard enough, but finding the answer to a brand new question is a quantum leap of difficulty in comparison.

To make matters even more complicated, answers to semantic (natural language) questions may be found in internal content, in published licensed content or some combination of both. In the latter case, only the seeker may be able to put the two together to derive or infer an answer.

Publishers of content for licensing play a convoluted game of how they will license their content to enterprises for semantic indexing in combination with internal content. The Linguamatics user community is primarily in life sciences; this is one more hurdle for them to overcome to effectively leverage the vast published repositories of biological and medical literature. Rigorous pricing may be good business strategy, but research using semantic search could make more headway with more reasonable royalties that reflect the need for collaborative use across teams and partners.

Content wants to be found and knowledge requires outlets to enable innovation to flourish. In too many cases technology is impaired by lack of business resources by buyers or arcane pricing models of sellers that hold vital information captive for a well-funded few. Semantically excellent retrieval depends on an engine’s indexing access to all contextually relevant content.

Event 3 – Leslie Owens of Forrester Research, at the Fall 2011 Enterprise Search Summit, conducted a very interesting interactive session that further affirms the elephant and blind men metaphor. Leslie is a champion of metadata best practices and writes about the competencies and expertise needed to make valuable content accessible. She engaged the audience with a series of questions about its wants, needs, beliefs and plans for semantic technologies. As described in an earlier paragraph about how well semantics serves us on the Web, most of the audience puts its faith in that model but is doubtful of how or when similar benefits will accrue to enterprise search. Leslie and a couple of others made the point that a lot more work has to be done on the back-end on content in the enterprise to get these high-value outcomes.

We’ll keep making the point until more adopters of semantic technologies get serious and pay attention to content, content enhancement, expert vocabulary management and metadata. If it is automatic understanding of your content that you are seeking, the vocabulary you need is one that you build out and enhance for your enterprise’s relevance. Semantic tools need to know the special language you use to give the answers you need.

Leveraging Two Decades of Computational Linguistics for Semantic Search

Over the past three months I have had the pleasure of speaking with Kathleen Dahlgren, founder of Cognition, several times. I first learned about Cognition at the Boston Infonortics Search Engines meeting in 2009. That introduction led me to a closer look several months later when researching auto-categorization software. I was impressed with the comprehensive English language semantic net they had doggedly built over a 20+ year period.

A semantic net is a map of language that explicitly defines the many relationships among words and phrases. It might be very simple, illustrating something as fundamental as a small geographical locale and all named entities within it, or as complex as the entire base language of English with every concept mapped to illustrate all the ways that any one term is related to other terms, as illustrated in this tiny subset. Dr. Dahlgren and her team are among the few that have created a comprehensive semantic net for English.
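
To make the idea of a semantic net concrete, here is a minimal sketch in Python of the simple case mentioned above: a small geographical locale and a few named entities within it. Everything in it (the SemanticNet class, the relation names such as is_a and located_in, and the sample terms) is a hypothetical illustration chosen for this example. It is not Cognition's data model or software, which the source does not describe at this level.

```python
from collections import defaultdict


class SemanticNet:
    """Toy semantic net: terms are nodes, labeled relations are edges.

    Hypothetical illustration only; not any vendor's actual model.
    """

    def __init__(self):
        # term -> set of (relation, other_term) pairs
        self._edges = defaultdict(set)

    def relate(self, term, relation, other):
        """Record that `term` stands in `relation` to `other`."""
        self._edges[term].add((relation, other))

    def related(self, term, relation=None):
        """Return terms directly related to `term`, optionally by one relation."""
        pairs = self._edges.get(term, ())
        return sorted(o for r, o in pairs if relation is None or r == relation)

    def ancestors(self, term, relation="is_a"):
        """Follow one relation transitively, e.g. Boston -> city -> place."""
        seen = []
        stack = [term]
        while stack:
            current = stack.pop()
            for parent in self.related(current, relation):
                if parent not in seen:
                    seen.append(parent)
                    stack.append(parent)
        return seen


# A tiny locale: assumed sample data, chosen only for illustration.
net = SemanticNet()
net.relate("Boston", "is_a", "city")
net.relate("city", "is_a", "place")
net.relate("Boston", "located_in", "Massachusetts")
net.relate("Charles River", "flows_through", "Boston")

print(net.ancestors("Boston"))              # ['city', 'place']
print(net.related("Boston", "located_in"))  # ['Massachusetts']
```

The point of the explicit relation labels is that a search engine can reason over them: a query about a "place" can reach Boston by following is_a links rather than by matching the string "place" in a document.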

In 2003, Dr. Dahlgren established Cognition as a software company to commercialize its semantic net, designing software to apply it to semantic search applications. As the Gilbane Group launched its new research on Semantic Software Technologies, Cognition signed on as a study co-sponsor and we engaged in several discussions with them that rounded out their history in this new marketplace. It was illustrative of pioneering in any new software domain.

Early adopters are key contributors to any software development. It is notable that Cognition has attracted experts in fields as diverse as medical research, legal e-discovery and Web semantic search. This gives the company valuable feedback for its commercial development. In any highly technical discipline, it is challenging and exciting to find subject experts knowledgeable enough to contribute to product evolution, and Cognition is learning from client experts where the best opportunities for growth lie.

Recent interviews with Cognition executives, and those of other sponsors, gave me the opportunity to get their reactions to my conclusions about this industry. These were the more interesting thoughts that came from Cognition after they had reviewed the Gilbane report:

  • Feedback from current clients and attendees at 2010 conferences, where Dr. Dahlgren was a featured speaker, confirms escalating awareness of the field; she feels that “This is the year of Semantics.” It is catching the imagination of IT folks who understand the diverse and important business problems to which semantic technology can be applied.
  • In addition to a significant upswing in semantics applied in life sciences, publishing, law and energy, Cognition sees specific opportunities for growth in risk assessment and risk management. Using semantics to detect signals, content salience, and measures of relevance is critical where the quantity of data and textual content is too voluminous for human filtering. There is not much evidence that financial services, banking and insurance are embracing semantic technologies yet, but these technologies could dramatically improve their business intelligence, and Cognition is well positioned to support them with its already tested tools.
  • Enterprise semantic search will begin to overcome the poor reputation that traditional “string search” has suffered. There is growing recognition among IT professionals that in the enterprise 80% of the queries are unique; these cannot be interpreted based on popularity or social commentary. Determining relevance or accuracy of retrieved results depends on the types of software algorithms that apply computational linguistics, not pattern matching or statistical models.

In Dr. Dahlgren’s view, there is no question that a team approach to deploying semantic enterprise search is required. This means that IT professionals will work side-by-side with subject matter experts, search experts and vocabulary specialists to gain the best advantage from semantic search engines.

The unique language aspects of an enterprise content domain are as important as the software a company employs. The Cognition baseline semantic net, out-of-the-box, will always give reliable and better results than traditional string search engines. However, it gives top performance when enhanced with enterprise language, embedding all the ways that subject experts talk about their topical domain: jargon, acronyms, code phrases, etc.
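
The idea of layering enterprise language over a baseline vocabulary can be sketched as a simple query expansion step. This is a minimal illustration in Python under stated assumptions: the dictionaries BASELINE_SYNONYMS and ENTERPRISE_VOCABULARY, the functions merge_vocabularies and expand_query, and all sample terms are hypothetical and invented for this example. A real semantic engine applies far richer linguistic analysis than a synonym lookup, and nothing here represents Cognition's implementation.

```python
# Hypothetical baseline vocabulary, standing in for general-language knowledge.
BASELINE_SYNONYMS = {
    "risk": {"hazard"},
    "car": {"automobile"},
}

# Hypothetical enterprise additions: acronyms and jargon that subject
# experts in one (assumed) medical research group actually use.
ENTERPRISE_VOCABULARY = {
    "mi": {"myocardial infarction", "heart attack"},
}


def merge_vocabularies(baseline, enterprise):
    """Return a new vocabulary with enterprise terms layered over the baseline."""
    merged = {term: set(synonyms) for term, synonyms in baseline.items()}
    for term, synonyms in enterprise.items():
        merged.setdefault(term, set()).update(synonyms)
    return merged


def expand_query(query, vocabulary):
    """Expand each query word with the equivalent terms the vocabulary knows."""
    expanded = []
    for word in query.lower().split():
        expanded.append(word)
        expanded.extend(sorted(vocabulary.get(word, ())))
    return expanded


enhanced = merge_vocabularies(BASELINE_SYNONYMS, ENTERPRISE_VOCABULARY)

# Baseline alone does not know the in-house acronym "MI".
print(expand_query("MI risk", BASELINE_SYNONYMS))
# ['mi', 'risk', 'hazard']

# With enterprise language added, the acronym reaches the expert terminology.
print(expand_query("MI risk", enhanced))
# ['mi', 'heart attack', 'myocardial infarction', 'risk', 'hazard']
```

Keeping the enterprise vocabulary as a separate layer, rather than editing the baseline in place, reflects the team approach described above: vocabulary specialists and subject matter experts can maintain their own terms without touching what ships out-of-the-box.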

With elements of its software already embedded in some notable commercial applications like Bing, Cognition is positioned for delivering excellent semantic search for an enterprise. They are taking on opportunities in areas like risk management that have been slow to adopt semantic tools. They will deliver software to these customers together with services and expertise to coach their clients through the implementation, deployment and maintenance essential to successful use. The enthusiasm expressed to me by Kathleen Dahlgren about semantics confirms what I also heard from Cognition clients. They are confident that the technology coupled with thoughtful guidance from their support services will be the true value-added for any enterprise semantic search application using Cognition.

The free download of the Gilbane study and deep-dive on Cognition was announced on their Web site at this page.

© 2018 Bluebill Advisors
