Tag: Linguamatics

Why is it so Hard to “Get” Semantics Inside the Enterprise?

Semantic Software Technologies: Landscape of High Value Applications for the Enterprise was published just over a year ago. Since then the marketplace has been increasingly active; new products emerge and discussion about what semantics might mean for the enterprise is constant. One thing that continues to strike me is the difficulty of explaining the meaning of, applications for, and context of semantic technologies.

Browsing through the topics in this excellent blog site, http://semanticweb.com, it struck me as the proverbial case of the blind men describing an elephant. A blog, any blog, is linear. While there are tools to give a blog dimension by clustering topics or presenting related information, it is difficult to understand the full relationships of any one blog post to another. Without a photographic memory, an individual does not easily connect ideas across a multi-year domain of blog entries. Semantic technologies can facilitate that process.

Those who embrace some concept of semantics are believers that search will benefit from “semantic technologies.” What is less clear is how evangelists, developers, searchers and the average technology user can coalesce around the applications that will semantically enable enterprise search.

On the Internet, content that successfully drives interest, sales, opinion and individual promotion does so through a combination of expert crafting of metadata, search engine technology that “understands” the language of the inquirer and the content that can satisfy the inquiry. Good answers are reached when questions are understood first and then the right content is selected to meet expectations.

In the enterprise, the same care must be given to metadata, search engine “meaning” analysis tools and query interpretation for successful outcomes. Magic does not happen without people behind the scenes who meet these three criteria by executing linguistic curation, content enhancement and computational linguistic programming.

Three recent meeting events illustrate various states of semantic development and adoption, even as the next conference, Semantic Tech & Business Conference – Washington, D.C. on November 29 – is upon us:

Event 1 – A relatively new group, the IKS-Community, funded by the EU, has been supporting open source software developers since 2009. In July they held a workshop in Paris just past the mid-point of their life cycle. Attendees were primarily entrepreneurs and independent open source developers seeking pathways for their semantically “tuned” content management solutions. I was asked to suggest where opportunities and needs exist in US markets. They were an enthusiastic audience and are poised to meet the tough market realities of packaging highly sophisticated software for audiences that will rarely understand how complex the stuff “under the hood” really is. My principal charge to them was to create tools that “make it really easy” to work with vocabulary management and content metadata capture, updates, and enhancements.

Event 2 – On this side of the pond, UK firm Linguamatics hosted its user group meeting in Boston in October. Having interviewed a number of their customers last year to better understand their I2E product line, I was happy to meet people I had spoken with and see the enthusiasm of a user community vested in such complex technology. Most impressive is the respectful tone and thoughtful sharing between Linguamatics principals and their customers. They share the knowledge of how hard it is to continually improve search technology that delivers answers to semantically complex questions using highly specialized language. Content contributors and inquirers are all highly educated specialists seeking answers to questions that have never been asked before. Think about it: designing search engines to deliver results for frequently asked questions or to find content on popular topics is hard enough, but finding the answer to a brand new question is a quantum leap of difficulty in comparison.

To make matters even more complicated, answers to semantic (natural language) questions may be found in internal content, in published licensed content or some combination of both. In the latter case, only the seeker may be able to put the two together to derive or infer an answer.

Publishers of content for licensing play a convoluted game of how they will license their content to enterprises for semantic indexing in combination with internal content. The Linguamatics user community is primarily in life sciences; this is one more hurdle for them to overcome to effectively leverage the vast published repositories of biological and medical literature. Rigorous pricing may be good business strategy, but research using semantic search could make more headway with more reasonable royalties that reflect the need for collaborative use across teams and partners.

Content wants to be found and knowledge requires outlets to enable innovation to flourish. In too many cases technology is impaired by lack of business resources by buyers or arcane pricing models of sellers that hold vital information captive for a well-funded few. Semantically excellent retrieval depends on an engine’s indexing access to all contextually relevant content.

Event 3 – Leslie Owens of Forrester Research, at the Fall 2011 Enterprise Search Summit, conducted a very interesting interactive session that further affirms the elephant and blind men metaphor. Leslie is a champion of metadata best practices and writes about the competencies and expertise needed to make valuable content accessible. She engaged the audience with a series of questions about its wants, needs, beliefs and plans for semantic technologies. As described in an earlier paragraph about how well semantics serves us on the Web, most of the audience puts its faith in that model but is doubtful of how or when similar benefits will accrue to enterprise search. Leslie and a couple of others made the point that a lot more work has to be done on the back-end on content in the enterprise to get these high-value outcomes.

We’ll keep making the point until more adopters of semantic technologies get serious and pay attention to content, content enhancement, expert vocabulary management and metadata. If it is automatic understanding of your content that you are seeking, the vocabulary you need is one that you build out and enhance for your enterprise’s relevance. Semantic tools need to know the special language you use to give the answers you need.

Semantically Focused and Building on a Successful Customer Base

Dr. Phil Hastings and Dr. David Milward spoke with me in June, 2010, as I was completing the Gilbane report, Semantic Software Technologies: A Landscape of High Value Applications for the Enterprise. My interest in a conversation was stimulated by several months of discussions with customers of numerous semantic software companies. Having heard perspectives from early adopters of Linguamatics’ I2E and other semantic software applications, I wanted to get some comments from two key officers of Linguamatics about what I heard from the field. Dr. Milward is a founder and CTO, and Dr. Hastings is the Director of Business Development.

A company with sustained profitability for nearly ten years in the enterprise semantic market space has credibility. Reactions from a maturing company to what users have to say are interesting and carry weight in any industry. My lines of inquiry and the commentary from the Linguamatics officers centered around their own view of the market and adoption experiences.

When asked about growth potential for the company outside of pharmaceuticals where Linguamatics already has high adoption and very enthusiastic users, Drs. Milward and Hastings asserted their ongoing principal focus in life sciences. They see a lot more potential in this market space, largely because of the vast amounts of unstructured content being generated, coupled with the very high-value problems that can be solved by text mining and semantically analyzing the data from those documents. Expanding their business further in the life sciences means that they will continue engaging in research projects with the academic community. It also means that Linguamatics semantic technology will be helping organizations solve problems related to healthcare and homeland security.

The wisdom of a measured and consistent approach comes through strongly when speaking with Linguamatics executives. They are highly focused and cite the pitfalls of trying to “do everything at once,” which would be the case if they were to pursue all markets overburdened with tons of unstructured content. While pharmaceutical terminology, a critical component of I2E, is complex and extensive, there are many aids to support it. The language of life sciences is in a constant state of being enriched through refinements to published thesauri and ontologies. However, in other industries with less technical language, Linguamatics can still provide important support to analyze content in the detection of signals and patterns of importance to intelligence and planning.

Much of the remainder of the interview centered on what I refer to as the “team competencies” of individuals who identify the need for any semantic software application; those are the people who select, implement and maintain it. When asked if this presents a challenge for Linguamatics or the market in general, Milward and Hastings acknowledged a learning curve and the need for a larger pool of experts for adoption. This is a professional growth opportunity for informatics and library science people. These professionals are often the first group to identify Linguamatics as a potential solutions provider for semantically challenging problems, leading business stakeholders to the company. They are also good advocates for selling the concept to management and explaining the strong benefits of semantic technology when it is applied to elicit value from otherwise under-leveraged content.

One Linguamatics core operating principle came through clearly when talking about the personnel issues of using I2E: the necessity of working closely with their customers. This means making sure that expectations about system requirements are correct, examples of deployments and “what the footprint might look like” are given, and best practices for implementations are shared. They want to be sure that their customers have a sense of being in a community of adopters and are not alone in the use of this pioneering technology. Building and sustaining close customer relationships is very important to Linguamatics, and that means an emphasis on services co-equally with selling licenses.

Linguamatics has come a long way since 2001. Besides a steady effort to improve and enhance their technology through regular product releases of I2E, there have been a lot of “show me” and “prove it” moments to which they have responded. Now, as confidence in and understanding of the technology ramps up, they are getting more complex and sophisticated questions from their customers and prospects. This is the exciting part as they are able to sell I2E’s ability to “synthesize new information from millions of sources in ways that humans cannot.” This is done by using the technology to keep track of and process the voluminous connections among information resources that exceed human mental limits.

At this stage of growth, with early successes and excellent customer adoption, it was encouraging to hear the enthusiasm of two executives for the evolution of the industry and their opportunities in it.

The Gilbane report and a deep dive on Linguamatics are available through this Press Release on their Web site.

Enterprise Search 2008 Wrap-Up

It would be presumptuous to think that I could adequately summarize a very active year of evolution among a huge inventory of search technologies. This entry is more about what I have learned and what I opine about the state-of-the-market, than an analytical study and forecast.

The weak link in the search market is product selection methods. My first thought is that we are in a state of technological riches without clear guideposts for which search models work best in any given enterprise. Those tasked to select and purchase products are not well-educated about the marketplace, yet they are usually not given budget or latitude to purchase expert analysis when it is available. It is a sad commentary that organizations grant travel budgets to attend conferences where only limited information can be gathered about products but will not spend a few hundred dollars on in-depth comparative expert analyses of a large array of products.

My sources for this observation are numerous, confirmed by speakers in our Gilbane conference search track sessions in Boston and San Francisco. As they related their personal case histories for selecting products, speakers shared no tales of actually doing literature searches or in-depth research using resources with a cost associated. This underscores another observation: those procuring search do not know how to search and operate in the belief that they can find “good enough” information using only “free stuff.” Even their review of material gathered is limited to skimming rather than a systematic reading for concrete facts. This does not make for well-reasoned selections. As noted in an earlier entry, a widely published chart stating that product X is a leader does nothing to enlighten your enterprise’s search for search. In one case, product leadership is determined primarily by the total software sales for the “leader,” of which search is a minuscule portion.

Don’t expect satisfaction with search products to rise until buyers develop smarter methods for selection and better criteria for making a buy decision that suits a particular business need.

Random Thoughts. It will be a very long time before we see a universally useful, generic search function embedded in Microsoft (MS) product suites as a result of the FAST acquisition. Asked earlier in the year by a major news organization whether I thought MS had paid too much for FAST, I responded “no” if what they wanted was market recognition but “yes” if they thought they were getting state-of-the-art technology. My position holds; the financial and legal mess in Norway only complicates the road to meshing search technology from FAST with Microsoft customer needs.

I’ve wondered what has happened to the OmniFind suite of search offerings from IBM. One source tells me it makes IBM money because none of the various search products in the line-up are standalone, nor do they provide an easy transition path from one level of product to another for upward scaling and enhancements. IBM can embed any search product with any bundled platform of other options and charge for lots of services to bring it on-line with heavy customization.

Three platform vendors seem to be penetrating the market slowly but steadily by offering more cohesive solutions to retrieval. Native search solutions are bundled with complete content capture, publishing and search suites, purposed for various vertical and horizontal applications. These are Oracle, EMC, and OpenText. None of these are out-of-the-box offerings and their approach tends to appeal to larger organizations with staff for administration. At least they recognize the scope and scale of enterprise content and search demands, and customer needs.

On User Presentations at the Boston Gilbane Conference. I was very pleased with all sessions and with the work and thought the speakers put into their talks. There were some noteworthy comments in those on Semantic Search and Text Technologies, Open Source and Search Appliances.

On the topic of semantic (contextual query and retrieval) search, text mining and analytics, the speakers covered the range of complexities in text retrieval, leaving the audience with a better understanding of how diverse this domain has become. Different software application solutions need to be employed based on point business problems to be solved. This will not change, and enterprises will need to discriminate about which aspects of their businesses need some form of semantically enabled retrieval and then match expectations to offerings. Large organizations will procure a number of solutions, all worthy and useful. Jeff Catlin of Lexalytics gave a clear set of definitions within this discipline, industry analyst Curt Monash provoked us with where to set expectations for various applications, and Win Carus of Information Extraction Systems illustrated the tasks extraction tools can perform to find meaning in a heap of content. The story has yet to be written on how semantic search is impacting, and will impact, our use of information within organizations.

Leslie Owens of Forrester and Sid Probstein of Attivio helped to ground the discussion of when and why open source software is appropriate. The major takeaway for me was an understanding of the type of organization that benefits most as a contributor and user of open source software. Simply put, you need to be heavily vested and engaged on the technical side to get out of open source what you need, to mold it to your purpose. If you do not have the developers to tackle coding, or the desire to share in a community of development, your enterprise’s expectations will not be met and disappointment is sure to follow.

Finally, several lively discussions about search appliance adoption and application (Google Search Appliance and Thunderstone) strengthen my case for doing homework and making expenditures on careful evaluations before jumping into procurement. While all the speakers seem to be making positive headway with their selected solutions, the path to success has involved more diversions and changes of course than necessary for some because the vetting and selecting process was too “quick and dirty” or dependent on too few information sources. This was revealed: true plug and play is an appliance myth.

What will 2009 bring? I’m looking forward to seeing more applications of products that interest me from companies that have impressed me with thoughtful and realistic approaches to their customers and target audiences. Here is an uncommon clustering of search products.

Multi-repository search across database applications, content collaboration stores, document management systems and file shares: Coveo, Autonomy, Dieselpoint, dtSearch, Endeca, Exalead, Funnelback, Intellisearch, ISYS, Oracle, Polyspot, Recommind, Thunderstone, Vivisimo, and X1. In this list is something for every type of enterprise and budget.

Business and analytics focused software with intelligence gathering search: Attensity, Attivio, Basis Technology, ChartSearch, Lexalytics, SAS, and Temis.

Comprehensive solutions for capture, storage, metadata management and search for high quality management of content for targeted audiences: Access Innovations, Cuadra Associates, Inmagic, InQuira, Knova, Nstein, OpenText, ZyLAB.

Search engines with advanced semantic processing or natural language processing for high quality, contextually relevant retrieval when quantity of content makes human metadata indexing prohibitive: Cognition Technologies, Connotate, Expert System, Linguamatics, Semantra, and Sinequa.

Content classifier, thesaurus management and metadata server products interplay with other search engines, and a few have impressed me with their vision and thoughtful approach to the technologies: MarkLogic, MultiTes, Nstein, Schemalogic, Seaglex, and Siderean.

Search with a principal focus on SharePoint repositories: BA-Insight, Interse, Kroll Ontrack, and SurfRay.

Finally, some unique search applications are making serious inroads. These include Documill for visual and image, Eyealike for image and people, Krugle for source code, and Paglo for IT infrastructure search.

This is the list of companies that interest me because I think they are on track to provide good value and technology, many still small but with promise. As always, the proof will be in how they grow and how well they treat their customers.

That’s it for a wrap on Year 2 of the Enterprise Search Practice at the Gilbane Group. Check out our search studies at http://gilbane.com/Research-Reports.html and PLEASE let me hear your thoughts on my thoughts or any other search related topic via the contact information at http://gilbane.com/

© 2018 Bluebill Advisors
