Collaboration, Convergence and Adoption

Here we are, halfway through 2011, and on track for a banner year in the adoption of enterprise search, text mining/text analytics, and their integration with collaborative content platforms. You might ask for evidence; what I can offer are anecdotal observations. Others track industry growth in terms of dollars spent, but that makes me leery when, over the past half dozen years, there has been so much disappointment expressed with the failures of legacy software applications to deliver satisfactory results. My antenna tells me we are on the cusp of expectations beginning to match reality as enterprises are finding better ways to select, procure, implement, and deploy applications that meet business needs.

What follows are my happy observations, after attending the 2011 Enterprise Search Summit in New York and the 2011 Text Analytics Summit in Boston. Other inputs for me continue to be a varied reading list of information industry publications, business news, vendor press releases and web presentations, and blogs, plus conversations with clients and software vendors. While this blog is normally focused on enterprise search, following content management technologies and system integration tools yields valuable insights into all the applications that contribute to search successes and frustrations.

Collaboration tools and platforms gained early traction in the 1990s as technology offerings to the knowledge management crowd. The idea was that teams and workgroups needed ways to share knowledge through contribution of work products (documents) to “places” for all to view. Document management systems inserted themselves into the landscape for managing the development of work products (creating, editing, collaborative editing, etc.). However, collaboration spaces and document editing and version control activities remained applications more apart than synchronized.

The collaboration space has been redefined largely because SharePoint now dominates current discussions about collaboration platforms and activities. While early collaboration platforms were carefully structured to provide a thoughtfully bounded environment for sharing content, their lack of provision for idiosyncratic and often necessary workflows probably limited market dominance.

SharePoint changed the conversation to one of build-it-to-do-anything-you-want-the-way-you-want (BITDAYWTWYW). What IT clearly wants is a single-vendor architecture that delivers content creation, management, collaboration, and search. What end-users want is workflow efficiency and reliable search results. This introduces another level of collaborative imperative, since the BITDAYWTWYW model requires expertise that few enterprise IT support people carry and fewer end-users would trust to their IT departments. So, third-party developers or software offerings become the collaborative option. SharePoint is not the only collaboration software but, because of its dominance, a large second tier of partner vendors is turning SharePoint adopters on to its potential. Collaboration of this type in the marketplace is ramping up wildly.

Convergence of technologies and companies is on the rise, as well. The non-Microsoft platform companies, OpenText, Oracle, and IBM, are basing their strategies on tightly integrating their solid cache of acquired mature products. These acquisitions have plugged gaps in text mining, analytics, and vocabulary management areas. Google and Autonomy are also entering this territory, although they are still short on the maturity model. The convergence of document management, electronic content management, text and data mining, analytics, e-discovery, a variety of semantic tools, and search technologies is shoring up the “big-platform” vendors to deal with “big-data.”

Sitting on the periphery is the open source movement. It is finding ways to alternatively collaborate with the dominant commercial players, disrupt select application niches (e.g., WCM), and contribute solutions where neither the SharePoint model nor the big-platform, tightly integrated models can win easy adoption. Lucene/Solr is finding acceptance in the government and non-profit sectors but also appeals to SMBs.

All of these factors were actively on display at the two meetings but the most encouraging outcomes that I observed were:

  • Rise in attendance at both meetings
  • More knowledgeable and experienced attendees
  • Significant increase in end-user presentations

The last of these brings me back to the adoption issue. Enterprises that previously sent people to earlier meetings to learn about technologies and products are now in the implementation and deployment stages. Thus, they are now able to contribute presentations with real experience and commentary about products. Presenters are commenting on adoption issues, usability, governance, successful practices, and pitfalls or unresolved issues.

Adoption is what will drive product improvements in the marketplace because experienced adopters are speaking out on their activities. Public presentations of user experiences can and should establish expectations for better tools, better vendor relationship experiences, more collaboration among products and ultimately, reduced complexity in the implementation and deployment of products.

Understanding the Smart Content Technology Landscape

If you have been following recent XML Technologies blog entries, you will notice we have been talking a lot lately about XML Smart Content, what it is and the benefits it can bring to an organization. These include flexible, dynamic assembly for delivery to different audiences, search optimization to improve customer experience, and improvements for distributed collaboration. Great targets to aim for, but you may ask: are we ready to pursue these opportunities? It might help to better understand the technology landscape involved in creating and delivering smart content.

The figure below illustrates the technology landscape for smart content. At the center are fundamental XML technologies for creating modular content, managing it as discrete chunks (with or without a formal content management system), and publishing it in an organized fashion. These are the basic technologies for “one source, one output” applications, sometimes referred to as Single Source Publishing (SSP) systems.

[Figure: XML and Smart Content Landscape]

The innermost ring contains capabilities that are needed even when using a dedicated word processor or layout tool, including editing, rendering, and some limited content storage capabilities. In the middle ring are the technologies that enable single-sourcing content components for reuse in multiple outputs. They include a more robust content management environment, often with workflow management tools, as well as multi-channel formatting and delivery capabilities and structured editing tools. The outermost ring includes the technologies for smart content applications, which are described below in more detail.

It is good to note that smart content solutions rely on structured editing, component management, and multi-channel delivery as foundational capabilities, augmented with content enrichment, topic component assembly, and social publishing capabilities across a distributed network. Descriptions of the additional capabilities needed for smart content applications follow.

Content Enrichment / Metadata Management: Once a descriptive metadata taxonomy is created or adopted, its use for content enrichment will depend on tools for analyzing and/or applying the metadata. These can be manual dialogs, automated scripts and crawlers, or a combination of approaches. Automated scripts can be created to interrogate the content to determine what it is about and to extract key information for use as metadata. Automated tools are efficient and scalable, but generally do not apply metadata with the same accuracy as manual processes. Manual processes, while ensuring better enrichment, are labor intensive and not scalable for large volumes of content. A combination of manual and automated processes and tools is the most likely approach in a smart content environment. Taxonomies may be extensible over time and can require administrative tools for editorial control and term management.
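The combined manual and automated approach can be pictured with a small sketch. Everything here is an assumption made for illustration: the taxonomy terms, the keyword lists, and the function names are hypothetical, and simple keyword matching stands in for whatever text analysis a real enrichment tool would perform.

```python
import re

# Hypothetical taxonomy: each term maps to keywords that suggest it.
TAXONOMY = {
    "installation": {"install", "setup", "configure"},
    "troubleshooting": {"error", "failure", "fix"},
    "licensing": {"license", "subscription"},
}

def auto_tag(text):
    """Automated pass: return taxonomy terms whose keywords appear in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return {term for term, keywords in TAXONOMY.items() if words & keywords}

def enrich(text, manual_tags=()):
    """Combine automated suggestions with tags supplied by an editor."""
    suggested = auto_tag(text)
    return {
        "auto": sorted(suggested),
        "manual": sorted(manual_tags),
        "tags": sorted(suggested | set(manual_tags)),
    }

topic = "If setup stops with an error, check the license file."
result = enrich(topic, manual_tags={"quick-start"})
print(result["tags"])
```

Keeping the automated and manual tags in separate fields lets an editor review or override the machine suggestions, which reflects the accuracy-versus-scale trade-off described above.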

Component Discovery / Assembly: Once data has been enriched, tools for searching and selecting content based on the enrichment criteria will enable more precise discovery and access. Search mechanisms can use metadata to improve search results compared to full text searching. Information architects and organizers of content can use smart searching to discover what content exists, and what still needs to be developed to proactively manage and curate the content. These same discovery and searching capabilities can be used to automatically create delivery maps and dynamically assemble content organized using them.
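As a rough illustration of metadata-driven discovery and assembly, the sketch below selects topics by their tags and builds a delivery map plus an assembled document from the matches. The topic records, tag names, and function names are all hypothetical; a real system would query a content repository rather than an in-memory list.

```python
# Hypothetical enriched topics; in practice these would come from a repository.
TOPICS = [
    {"id": "t1", "title": "Installing the server", "body": "Run the installer.",
     "tags": {"installation", "admin"}},
    {"id": "t2", "title": "Resetting a password", "body": "Open the account page.",
     "tags": {"troubleshooting", "end-user"}},
    {"id": "t3", "title": "Installing the client", "body": "Download the client.",
     "tags": {"installation", "end-user"}},
]

def find_topics(topics, required_tags):
    """Return the topics that carry every one of the required tags."""
    required = set(required_tags)
    return [t for t in topics if required <= t["tags"]]

def assemble(topics, required_tags):
    """Build a delivery map (ordered topic ids) and the assembled text."""
    selected = find_topics(topics, required_tags)
    delivery_map = [t["id"] for t in selected]
    document = "\n\n".join(f"{t['title']}\n{t['body']}" for t in selected)
    return delivery_map, document

delivery_map, document = assemble(TOPICS, {"installation", "end-user"})
print(delivery_map)
print(document)
```

The same query that drives discovery also drives assembly here, so changing the tag set changes the deliverable without touching the topics themselves.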

Distributed Collaboration / Social Publishing: Componentized information lends itself to a more granular update and maintenance process. Several users can work simultaneously on topics that will appear in a single deliverable, which can shorten schedules. Subject matter experts, both remote and local, may be included in review and content creation processes at key steps. Users of the information may want to “self-organize” the content of greatest interest to them, and even augment or comment upon specific topics. A distributed social publishing capability will enable a broader range of contributors to participate in the creation, review, and updating of content in new ways.

Federated Content Management / Access: Smart content solutions can integrate content without duplicating it in multiple places, rather accessing it across the network in the original storage repository. This federated content approach requires the repositories to have integration capabilities to access content stored in other systems, platforms, and environments. A federated system architecture will rely on interoperability standards (such as CMIS), system agnostic expressions of data models (such as XML Schemas), and a robust network infrastructure (such as the Internet).
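A minimal sketch of the federated idea follows, assuming a made-up reference format of `repository:document-id`. The class and method names are hypothetical and this is not the CMIS API; the in-memory repositories simply stand in for remote systems that a real deployment would reach through an interoperability standard.

```python
class InMemoryRepository:
    """Stand-in for a real repository that would be reached over the network."""

    def __init__(self, name, documents):
        self.name = name
        self._documents = documents

    def fetch(self, doc_id):
        """Return the stored content, or None if the id is unknown."""
        return self._documents.get(doc_id)


class FederatedCatalog:
    """Resolves references against the original repositories without copying."""

    def __init__(self, repositories):
        self._repos = {repo.name: repo for repo in repositories}

    def resolve(self, reference):
        """Look up 'repository:document-id' in the repository that owns it."""
        repo_name, _, doc_id = reference.partition(":")
        repo = self._repos.get(repo_name)
        if repo is None:
            raise KeyError(f"unknown repository: {repo_name}")
        content = repo.fetch(doc_id)
        if content is None:
            raise KeyError(f"unknown document: {reference}")
        return content


catalog = FederatedCatalog([
    InMemoryRepository("manuals", {"m-1": "<topic>Install guide</topic>"}),
    InMemoryRepository("support", {"s-9": "<topic>Known issues</topic>"}),
])

# A deliverable stores references only; content stays in its source repository.
deliverable = ["manuals:m-1", "support:s-9"]
print([catalog.resolve(ref) for ref in deliverable])
```

Because the deliverable holds references rather than copies, an update made in the source repository shows up the next time the reference is resolved.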

These capabilities address a broader range of business activity and, therefore, fulfill more business requirements than single-source content solutions. Assessing your ability to implement these capabilities is essential in evaluating your organization’s readiness for a smart content solution.

Enterprise Search and Collaboration, or is it Compliance?

For two weeks in a row I have been struck by the appearance of full page ads on the inside cover of Information Week for Autonomy ControlPoint. For a leading search vendor, this positioning is interesting and raises a number of rhetorical questions about Autonomy’s direction and perhaps even the positioning of search in the marketplace. Top of my mind are these:

  • How will Autonomy be viewed by IT folks, who I assume are the principal readers of Information Week?
  • Is this a shift away from an emphasis on search as “search” by Autonomy?
  • Is Autonomy just expanding its range to broader business interests to gain better enterprise penetration?
  • Will their deep technical competence in search be as rich in the areas of governance and compliance?

To try to get a handle on all of this, since the second ad had no URL, I went to the electronic version online at Information Week archives but discovered that the ads don’t appear in the PDF. No problem; I went to the advertisers’ index and clicked on the Autonomy link, thinking that the link would take me to the ControlPoint pages on their Web site. It only took me to the main page for Autonomy where there was nothing referring to ControlPoint, compliance, regulation or governance (all words prominent in the magazine print ads). I tried the drop-down for Products; nothing there either. At least Autonomy uses IDOL as its search engine on its own Web site, so I tried it. Yea! ControlPoint appeared in the results; the first entry got me to a page describing it.

But what else did I learn by following the breadcrumbs? A step back to the “products” level brought me to an Autonomy Electronics Records Management description and I began to notice the logo in the upper right said “Autonomy Meridio.” Lots of clicks later, I discovered that Meridio was acquired by Autonomy in 2007, which I probably would have known if I had paid more attention to “non-search” stuff. ControlPoint belongs in that family of products. When I clicked on this sidebar link, Autonomy ControlPoint: Information Governance for SharePoint and this one, Meridio eDRM for Microsoft Office, more questions came to mind:

  • Is Autonomy, the search company with its Meridio and Interwoven acquisitions, having a serious run at Microsoft by entering their traditional markets?
  • If an office tools software company like Microsoft slides into the search market by acquiring FAST and then leverages its great success with SharePoint by making FAST its default search offering, why shouldn’t Autonomy turn the tables?
  • By appealing to IT professionals will Autonomy be able to gain mind share that pits them directly against Microsoft with language like “Named Email and Compliance Vendor of the Year by Financial-i” and “Is SharePoint enough?”

Yes, we are going well Beyond Search, aren’t we?

Collaboration and Expertise Bring Focus to Enterprise Search

The topic of the month seems to be “social search;” I confess to being a willing participant in this new semantic framing of a rash of innovative new tools for enterprise search products. I would, however, defer to the professional intent of some great new features by stressing that this is really a next step in bringing collaboration closer to where expert knowledge workers do their work. As I view enterprises with a heavy research component, 10 – 30% of the average professional’s time is spent in a search environment. In other words, we all spend a lot of our day just looking for “stuff.” We also spend a significant amount of time in meetings, exchanging emails, and making presentations. More and more of us contribute to collaboration spaces where we work together on various types of document production.

Putting together the work habits and needs of a time-poor and information-rich community of knowledge workers in a post-processing environment where they can “mash up,” tag, and comment on their search discoveries is a natural evolution of search technology. It is remarkable to see how search companies that are serious about the enterprise market (search within and for the enterprise) are rapidly turning out enhancements for their audiences, now that they are convinced that “Enterprise 2.0” has a boatload of early adopters in the wings. Search should always be about connecting experts and their content. Add collaboration and the ability for searchers to enrich search results for the benefit of their colleagues, and you have a model for soon-to-be heavily adopted products.

That pretty much sums up how we should be thinking about “social search” in the enterprise. You can hear more of my views in a KMWorld Webinar, Using Social Search to Drive Innovation through Collaboration next Tuesday in a presentation sponsored by Vivisimo, one of the leaders in this area.

The week had plenty of virtual ink devoted to this topic, so you might want to check out these two articles with more commentary. The first was in eWeek, by Clint Boulton, Vivisimo Marries Search, Social Networking. The second shows that Google is on the bandwagon, as well: Google Enterprise Search gets social, a blog entry at C|Net by Rafe Needleman.

