Archive for Enterprise search

Enterprise Search and Findability Survey 2014

For the third year, Findwise, an enterprise search system integrator and consulting firm based in Sweden, is conducting a global survey of user experiences with content findability tools. Many of us in the field of search technology want to see how enterprises are progressing with their search initiatives. The 2012 baseline gives us a starting point for seeing the trend line, but we need more numbers from the user community. That means participation from institutional implementers, funding managers, administrators, and end-users with a stake in the outcomes they receive when they use any search technology.

Please do not let this opportunity pass: take the survey, and sign up to receive the resulting report later in the year. I especially hope that Findwise gets a good uptick in responses to report at the December Gilbane conference. We need to hear more voices, so pass this link along to colleagues in other organizations.

Here is the link: Enterprise Search and Findability Survey 2014

Content Accessibility in the Enterprise is Really Search

The Gilbane Conference call for speakers is out and submissions are due in three days, May 2. As one who has been writing about enterprise search topics for over ten years, and engaged in search technology development since 1974, I know it is still a relevant topic.

If you are engaged in any role, in any type of content repository development or use, you know that excellent accessibility is fundamental to making content valuable and usable. You are also probably involved in influencing, or trying to influence, decisions that will make certain that technology implementations have adequate staffing for content metadata and controlled vocabulary development.

Please take a look at this conference track outline and consider where your involvement can be inserted. Then submit a speaking proposal to share your direct experiences with search or a related topic. Our conference participants love to hear real stories of enterprise initiatives that illustrate: innovative approaches, practical solutions, workarounds to technical and business problems, and just plain scrappy projects that bring value to a group or to the whole enterprise.  In other words, how do you get the job done within the constraints you have faced?

Track E: Content, Collaboration and Employee Engagement

Designed for content, information, technical, and business managers focused on enterprise social, collaboration, intranet, portal, knowledge, and back-end content applications.

  • Collaboration and the social enterprise
  • Collaboration tools & social platforms
  • Enterprise social metrics
  • Community building & knowledge sharing
  • Content management & intranet strategies
  • Enterprise mobile strategies
  • Content and information integration
  • Enterprise search and information access
  • Semantic technologies
  • Taxonomies, metadata, tagging

Please consider participating in the conference, especially if content findability and accessibility are high on your list of “must have” content solutions. Submit your proposal here. The need for good content findability has never been greater, and your experiences deserve to be heard by vendors, IT managers, and content experts together in this forum.

Enterprise Search Europe special discount

Last May I was delighted to participate in Enterprise Search Europe in London. There I found a committed contingent from companies seeking search solutions, entrepreneurs, and search technology integrators. They were there to share common enterprise experiences with search technologies and implementation issues. Usability, specialized business use cases, and leveraging search results in business intelligence were the three areas I found most engaging. Missing from the audience was a group that belongs at this meeting: content managers. Among them should be expert taxonomists, metadata specialists, and information architects responsible for the many repositories that go into quality enterprise search deployments. Take advantage of the great expertise you will have access to at this meeting. I am happy to extend a 20% discount code, so please consider using it. Apply MOULTON20 in the priority code field at online registration, which you can find at the conference site: http://www.enterprisesearcheurope.com/2014/ .

Findability Issues Impact Everything Work Related

This should have been the last post of 2013 but you know how the holidays and weather (snow removal) get in the way of real work. However, throughout the month of December emails and messages, meetings, and reading peppered me with reminders that search surrounds everything we do. In my modest enterprise, findability issues occupy a major portion of my day and probably yours, too.

How important search is for workers in any enterprise is easy to determine if we think about how so many of us go about our daily work routines:

  • Receiving and sending emails, text messages, voice mail,
  • Documenting and disseminating work results,
  • Attending meetings where we listen, contribute, view presentations and take notes,
  • Researching and studying new topics or legacy content to begin or execute a project

As content accrues, and with it information of value that will be needed for future work activities, finding mechanisms come into play, or should. That is why I probably expend 50% of my day consuming content, determining relevance and importance, deciding where and how it needs to be preserved, and clearing out debris. The other 50% of the time is devoted to retrieving, digesting, and creating new content, new formulations of found material. The most common outputs are the result of:

  • Evaluation of professionals who would be candidates for speaking at programs I help organize,
  • Studying for an understanding of client needs, challenges and work environments,
  • Evaluation of technology solutions and tools for clients and my own enterprise,
  • Responding to inquiries for information, introductions, how-to solve a problem, opinions about products, people or processes,
  • Preparing deliverables to clients related to projects

Without the means and methods of my finding systems, those used by my clients, and those in the public domain, no work would get done. It is just that simple.

So, what came at me in December that made the cut of information to be made findable? A lot, but here are just three examples.

Commentary on metadata and taxonomy governance was a major topic in one session I moderated at the Gilbane Conference in Boston, Dec. 3-4, 2013. All of the panelists shared terrific observations about how and why governance of metadata and taxonomies is enterprise-critical; from one came this post-conference blog post, Taxonomy Governance, written by Heather Hedden, author of The Accidental Taxonomist and a frequent speaker on taxonomy topics. The point here: when you engage in any work activity to consistently organize and manage the professional content in your possession, you are governing that material for findability. Anything that improves the process in the enterprise is going to be a findability plus, just as it is for your own content.

Also in December, the Boston KM Forum hosted Allan Lewis, an “informaticist” at Lahey Health in Massachusetts; he is responsible for an initiative that will support healthcare professionals’ sharing of information via social business software tools. As a healthcare informatics professional, working with electronic clinical data sets to better codify diagnostic information, Allan is engaging in an enterprise-wide project. It is based on the need for a common view of medical conditions, how to diagnose them, and assign accurate classification to ensure the best records. Here is an issue where the quality of governing rules will be reached through consensus among medical experts. Again, findability is a major goal of this effort for everyone in a system, from the clinicians who need to retrieve information to the business units who must track cases and outcomes for accountability.

Last, from among the hundreds of information resources crossing my desk last month came one, a “Thank you for donating to the Wikimedia Foundation. You are wonderful!” You might ask why this did not simply get filed away for my tax return preparation; it almost did but read on.

Throughout the year I have been involved in numerous projects that rely on my ability to find definitions or explanations of hundreds of topics outside my areas of expertise. Sometimes I use known resources, such as government agency web sites that specialize in a field, or those of professional associations and publications with content by experts in a domain. I depend on the finding tools at those sites to get what I am looking for. You can be certain that I know which ones have quality findability and which have difficult-to-use search functions.

When all else fails, my Google search is usually formatted as “define: xxx yyy” to include a phrase or name I seek to better understand. A simple term or acronym will usually net a glossary definition but for more complex topics Wikipedia is the most prominent resource showing up in results. Sometimes it is just a “stub” with notations that the entry needs updating, but more often it is very complete with scores of links and citations to help further my research. During one period when I had been beating a path to its site on a frequent basis, a banner requesting a donation appeared and persisted. As a professional benefiting from its work, I contributed a very modest sum. When the thank you came, I found the entire correspondence compelling enough to share parts of it with my readers. The last paragraph is one I hope you will read because you are interested in “search” and probably have the knowledge to contribute content that others might search for. Contributions of money and your knowledge are both important.

It’s easy to ignore our fundraising banners, and I’m really glad you didn’t. This is how Wikipedia pays its bills — people like you giving us money, so we can keep the site freely available for everyone around the world.

People tell me they donate to Wikipedia because they find it useful, and they trust it because even though it’s not perfect, they know it’s written for them. Wikipedia isn’t meant to advance somebody’s PR agenda or push a particular ideology, or to persuade you to believe something that’s not true. …

You should know: your donation isn’t just covering your own costs. The average donor is paying for his or her own use of Wikipedia, plus the costs of hundreds of other people. …

Most people don’t know Wikipedia’s run by a non-profit. Please consider sharing this e-mail with a few of your friends to encourage them to donate too. And if you’re interested, you should try adding some new information to Wikipedia. If you see a typo or other small mistake, please fix it, and if you find something missing, please add it. There are resources here that can help you get started. Don’t worry about making a mistake: that’s normal when people first start editing and if it happens, other Wikipedians will be happy to fix it for you.

So, this is my opening for 2014, a reflection on what it means to be able to find what we need to do our work and keep it all straight. The plug for Wikipedia is not a shameless endorsement for any personal gain, just an acknowledgement that I respect and have benefitted from the collaborative spirit under which it operates. I am thanking them by sharing my experience with you.

Healthcare e-Commerce Search Lessons for the Enterprise

Search Tools Wanting on Many Exchanges: This headline was too good to pass up even though stories about the failures of the Affordable Care Act web site are wearing a little thin right now. For those of us long involved in developing, delivering, and supporting large software solutions, we can only imagine all the points at which the project went wrong to bring about this massive meltdown. Seeing this result: “many who get through the log-in process on the new health insurance exchanges then have trouble determining whether the offered policies will provide the coverage they need”, we who spend hours on external and internal web sites know the frustrations very well. It is not the “search tools” that are lacking but the approach to design and development.

This current event serves as a cautionary tale to any enterprise attempting its own self-service web-site, for employees’ in-house use, customer service extranets or direct sales on public facing sites.

Here are the basic requirements for anyone launching large-scale site search, internally or externally.

Leadership in an endeavor of this scale requires deep understanding of the scope of the goals. All the goals must be met in the short term (enrollment of both the neediest without insurance AND the young procrastinators) and be scalable for the long term. What this requires is a single authority with:

  • Experience on major projects, global in reach, size and complexity
  • Knowledge of how all the entities in the healthcare industry work and inter-relate
  • Maturity enough to understand and manage the software engineers (designers), coders, business operations managers, writers, user interface specialists, and business analysts, with their myriad personality types, who will be doing the work to bring millions of computing elements into sync
  • The authority and control to hire, fire, and prioritize project elements.

Simplicity of site design to begin a proof of concept, or several proofs of concept, rolled out to real prospects using a minimalist approach with small teams. This is a surer path to understanding what works and what doesn’t. Think of the approach to the Manhattan Project, where multiple parallel efforts were employed to get to the quickest and most practical deployment of an atomic weapon. Groves had the leadership authority to shift initiative priorities as each group progressed and made a case for its approach. That more technically complex endeavor was achieved over a four-year period, only one year more than this government healthcare site development.

Because the ability to find information is the first step for almost every shopper, it makes sense to get search and navigation working smoothly first, even as content targets and partner sites are being readied for access. Again, deep understanding of the audience, what it wants to know first, and how that audience will go about finding it is imperative. Usability experts with knowledge of the healthcare industry would be critical in such an effort.

The priority is to enable search before requiring identity. Forcing enrollment of multitudes of people who just want to search, many of whom will never become buyers (e.g. counselors, children helping elderly parents find information, insurers wanting to verify their own linkages and site flow from the main site), is madness. No successful e-commerce site demands this from a new visitor, and the government healthcare site has no business harvesting a huge amount of personal data that it has no use for (i.e. marketing).

Hundreds of major enterprises have failed at massive search implementations because the focus was on the technology instead of the business need, the user need and content preparation. Good to excellent search will always depend on an excellent level of organization and categorization for the audience and use intended. That is how excellent e-commerce sites flourish. Uniformity, normalization, and consistency models take time to build and maintain. They need smart people with time to think through logical paths to information to do this work. It is not a task for programmers or business managers. Content specialists and taxonomists who have dealt with content in healthcare areas for years are needed.
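
The uniformity and normalization work described above can be made concrete with a small sketch. This is an illustration only, not any site’s actual implementation; the controlled vocabulary and synonym sets below are invented to show the kind of mapping a content specialist or taxonomist would define and a search system would then enforce.

```python
# Illustrative sketch: normalizing the variant terms users type (or that
# appear in documents) to a controlled vocabulary. The vocabulary and
# synonyms are invented for this example.

CONTROLLED_VOCABULARY = {
    "deductible": {"deductable", "annual deductible", "yearly deductible"},
    "premium": {"monthly premium", "monthly cost", "monthly payment"},
    "copayment": {"copay", "co-pay", "co-payment"},
}

# Invert the mapping once so normalization is a single dictionary lookup.
SYNONYM_TO_PREFERRED = {
    synonym: preferred
    for preferred, synonyms in CONTROLLED_VOCABULARY.items()
    for synonym in synonyms
}

def normalize_term(raw: str) -> str:
    """Map a raw query or document term to its preferred vocabulary term."""
    term = raw.strip().lower()
    if term in CONTROLLED_VOCABULARY:
        return term
    # Unknown terms pass through unchanged; a real system would log them
    # for a taxonomist to review and add to the vocabulary.
    return SYNONYM_TO_PREFERRED.get(term, term)

print(normalize_term("Co-Pay"))
print(normalize_term("monthly premium"))
```

The point of the sketch is that the hard part is not the code; it is the sustained human effort of building and maintaining the vocabulary itself, which is exactly the work that cannot be left to programmers or business managers alone.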

How a public project could fail so badly will eventually be examined and the results made known. I will wager that these three basic elements were missing from day one: a single strong leader; a simple, multi-track development approach with prototyping; and attention to preparing searchable content for the target audience. Here is a lesson learned for your enterprise.

Launching Your Search for Enterprise Search Fundamentals

It’s the beginning of a new year and you are tasked with responsibility for your enterprise to get top value from the organization’s information and knowledge assets. You are the IT applications specialist assigned to support individual business units with their technology requests. You might encounter situations similar to these:

  • Marketing has a major initiative to re-write all product marketing pieces.
  • Finance is grappling with two newly acquired companies whose financial reports, financial analyses, and forecasts are scattered across a number of repositories.
  • Your Legal department has a need to categorize and analyze several thousand “idea records” that came from the acquired companies in order to be prepared for future work: patenting new products.
  • Research and development is attempting to categorize, and integrate into a single system, R&D reports from an existing repository with those from the acquisitions.
  • Manufacturing requires access to all schematics for eight new products in order to refine and retool manufacturing processes and equipment in their production area.
  • Customer support demands just-in-time retrieval and accuracy to meet their contractual obligations to tier-one customers, often from field operations, or while in transit to customer sites. The latter case often requires retrieval of a single, unique piece of documentation.

All of these groups have needs which, if not met, present high risk or even exposure to lawsuits from clients or investors. You have only one specialist on staff with two years of experience with a single search engine, and that person is currently deployed to field service operations.

Looking at just these few examples we can see that a number of search-related technologies plus human activities may be required to meet the needs of these diverse constituents. From finding and assembling all financial materials across a five-year time period for all business units, to recovering scattered and unclassified emails and memos that contain potential product ideas, the initiative may be huge. A sizable quantity of content and business structural complexity may require a large-scale effort just to identify all possible repositories to be searched. Identifying those repositories is a problem to be solved before even thinking about the search technologies to adopt for the “finding” activity.

Beginning the development of a categorizing method and terminology to support possible “auto-categorization” might require text mining and text analysis applications to assess the topical nomenclature and entity attributes that would make a good starting point. These tools can be employed before the adoption of enterprise search applications.
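
As an illustration of the kind of text-mining pass described above, the sketch below scores candidate terms across a small document set so a taxonomist has a ranked starting vocabulary. It uses simple TF-IDF weighting from the standard library; the sample documents are invented stand-ins for R&D reports and idea records.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase and split text into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def candidate_terms(docs, top_n=3):
    """Rank terms by TF-IDF summed across the document set."""
    doc_tokens = [tokenize(d) for d in docs]
    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for tokens in doc_tokens:
        df.update(set(tokens))
    n_docs = len(docs)
    scores = Counter()
    for tokens in doc_tokens:
        for term, count in Counter(tokens).items():
            # Terms that appear in every document score zero and drop out.
            scores[term] += count * math.log(n_docs / df[term])
    return [term for term, _ in scores.most_common(top_n)]

# Invented sample content standing in for acquired-company documents.
docs = [
    "patent filing for the acquired battery product line",
    "battery cell chemistry research report",
    "forecast of battery product revenue by business unit",
]
print(candidate_terms(docs))
```

A production effort would use a purpose-built text analytics product rather than a toy scorer, but even this small pass shows why such tools belong in the project before an enterprise search application is selected: they surface the nomenclature the categorization scheme must cover.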

Understanding all the “use-cases” for which engineers may seek schematics in their re-design and re-engineering of a manufacturing plant is essential to selecting the best search technology for them and testing it for deployment.

The bottom line: there is a lot more to know about content and supporting its accessibility with search technology than simply acquiring the search application. Furthermore, the situations that demand search solutions within the enterprise are far different from Web searching for a product or general research on a new topic, and applying them successfully requires far greater understanding of user search expectations.

To meet the full challenge of providing the technologies and infrastructure that will deliver reliable and high value information and knowledge when and where required, you must become conversant with a boatload of search related topics. So, where do you begin?

A new primer, manageable in length and logical in order, has just been published. It contains the basics you will need to understand the enterprise context for search. A substantive list of reading resources, a glossary, and a vendor URL list round out the book. As the author suggests, and I concur, you should probably begin with Chapter 12, two pages that will ground you quickly in the key elements of your prospective undertaking.

What is the book? Enterprise Search (of course) by Martin White, O’Reilly Media, Inc., Sebastopol, CA. © 2013 Martin White. 192p. ISBN: 978-1-449-33044-6. Also available as an online edition at: http://my.safaribooksonline.com/book/databases/data-warehouses/9781449330439

First group of Gilbane sponsors posted for Boston conference

Conference planning is starting to ramp up. See our first group of Gilbane sponsors, and don’t forget the call for papers!

Collaboration, Convergence and Adoption

Here we are, halfway through 2011, and on track for a banner year in the adoption of enterprise search, text mining/text analytics, and their integration with collaborative content platforms. You might ask for evidence; what I can offer is anecdotal observation. Others track industry growth in terms of dollars spent, but that makes me leery when, over the past half dozen years, there has been so much disappointment expressed with the failures of legacy software applications to deliver satisfactory results. My antenna tells me we are on the cusp of expectations beginning to match reality as enterprises find better ways to select, procure, implement, and deploy applications that meet business needs.

What follows are my happy observations after attending the 2011 Enterprise Search Summit in New York and the 2011 Text Analytics Summit in Boston. Other inputs for me continue to be a varied reading list of information industry publications, business news, vendor press releases and web presentations, and blogs, plus conversations with clients and software vendors. While this blog normally focuses on enterprise search, following content management technologies and system integration tools contributes valuable insights into all the applications that bring about search successes and frustrations.

Collaboration tools and platforms gained early traction in the 1990s as technology offerings to the knowledge management crowd. The idea was that teams and workgroups needed ways to share knowledge through contribution of work products (documents) to “places” for all to view. Document management systems inserted themselves into the landscape for managing the development of work products (creating, editing, collaborative editing, etc.). However, collaboration spaces and document editing and version control activities remained applications more apart than synchronized.

The collaboration space has been redefined largely because SharePoint now dominates current discussions about collaboration platforms and activities. While early collaboration platforms were carefully structured to provide a thoughtfully bounded environment for sharing content, their lack of provision for idiosyncratic and often necessary workflows probably limited market dominance.

SharePoint changed the conversation to one of build-it-to-do-anything-you-want-the-way-you-want (BITDAYWTWYW). What IT clearly wants is a single-vendor architecture that delivers content creation, management, collaboration, and search. What end-users want is workflow efficiency and reliable search results. This introduces another level of collaborative imperative, since the BITDAYWTWYW model requires expertise that few enterprise IT support people carry and fewer end-users would trust to their IT departments. So third-party developers or software offerings become the collaborative option. SharePoint is not the only collaboration software but, because of its dominance, a large second tier of partner vendors is turning SharePoint adopters on to its potential. Collaboration of this type in the marketplace is ramping wildly.

Convergence of technologies and companies is on the rise as well. The non-Microsoft platform companies, OpenText, Oracle, and IBM, are basing their strategies on tightly integrating their solid cache of acquired mature products. These acquisitions have plugged gaps in text mining, analytics, and vocabulary management. Google and Autonomy are also entering this territory, although they are still short on the maturity model. The convergence of document management, electronic content management, text and data mining, analytics, e-discovery, a variety of semantic tools, and search technologies is shoring up the “big-platform” vendors to deal with “big data.”
Sitting on the periphery is the open source movement. It is finding ways to collaborate with the dominant commercial players, disrupt select application niches (e.g. WCM), and contribute solutions where neither the SharePoint model nor the big-platform, tightly integrated models can win easy adoption. Lucene/Solr is finding acceptance in the government and non-profit sectors but also appeals to SMBs.

All of these factors were actively on display at the two meetings but the most encouraging outcomes that I observed were:

  • Rise in attendance at both meetings
  • More knowledgeable and experienced attendees
  • Significant increase in end-user presentations

The latter brings me back to the adoption issue. Enterprises that sent people to earlier meetings to learn about technologies and products are now in the implementation and deployment stages. Thus, they are now able to contribute presentations with real experience and commentary about products. Presenters are commenting on adoption issues, usability, governance, successful practices, and pitfalls or unresolved issues.

Adoption is what will drive product improvements in the marketplace because experienced adopters are speaking out on their activities. Public presentations of user experiences can and should establish expectations for better tools, better vendor relationship experiences, more collaboration among products and ultimately, reduced complexity in the implementation and deployment of products.

Coherence and Augmentation: KM-Search Connection

This space is not normally used to comment on knowledge management (KM), one of my areas of consulting, but a recent conference gives me an opening to connect the dots between KM and search. Dave Snowden and Tom Stewart always have worthy commentary on KM and as keynote speakers they did not disappoint at KMWorld. It may seem a stretch but by taking a few of their thoughts out of context, I can synthesize a relationship between KM and search.

KMWorld, Enterprise Search Summit, SharePoint Symposium and Taxonomy Boot Camp moved to Washington D.C. for the 2010 Fall Conference earlier this month. I attended to teach a workshop on building a semantic platform, and to participate in a panel discussion to wrap up the conference with two other analysts, Leslie Owen and Tony Byrne with Jane Dysart moderating.

Comments from the first and last keynote speakers of the conference inspired my final panel comments, counseling attendees to lead by thoughtfully leveraging technology only to enhance knowledge. But there were other snippets that prompt me to link search and KM.

Tom Stewart’s talk was entitled, Knowledge Driven Enterprises: Strategies & Future Focus, which he couched in the context of achieving a “coherent” winning organization. He explained that reaching the coherence destination requires understanding different types of knowledge and how we need to behave to attain each type (e.g. “knowable complicated” knowledge calls for experts and research; “emergent complex” knowledge calls for leadership and “sense-making”).

Stewart describes successful organizations as those in which “the opportunities outside line up with the capabilities inside.” He explains that those “companies who do manage to reestablish focus around an aligned set of key capabilities” use their “intellectual capital” to identify their intangible assets: human capability, structural capital, and customer capital. They build relationship capital from among these capabilities to create a coherent company. Although Stewart does not mention “search,” it is important to note that one means of identifying intangible assets is well-executed enterprise search with associated analytical tools.

Dave Snowden also referenced “coherence,” (messy coherence), even as he spoke about how failures tend to be more teachable (memorable) than successes. If you follow Snowden, you know that he founded the Cognitive Edge and has developed a model for applying cognitive learning to help build resilient organizations. He has taught complexity analysis and sense-making for many years and his interest in human learning behaviors is deep.

To follow the entire thread of Snowden’s presentation on “The Resilient Organization,” follow this link. I was particularly impressed with his statement about the talk, “one of the most heart-felt I have given in recent years.” It was one of his best, but two particular comments bring me to the connection between KM and search.

Dave talked about technology as “cognitive augmentation,” its only truly useful function. He also puts forth what he calls the “three Golden rules: Use of distributed cognition, wisdom but not foolishness of crowds; finely grained objects, information and organizational; and disintermediation, putting decision makers in direct contact with raw data.”

Taking these fragments of Snowden’s talk, a technique he seems to encourage, I put forth a synthesized view of how knowledge and search technologies need to be married for consequential gain.

We live and work in a highly chaotic information soup, one in which we are fed a steady diet of fragments (links, tweets, analyzed content) from which we are challenged as thinkers to derive coherence. The best knowledge practitioners will leverage this messiness by detecting weak signals and seek out more fragments, coupling them thoughtfully with “raw data” to synthesize new innovations, whether they be practices, inventions or policies. Managing shifting technologies, changing information inputs, and learning from failures (our own, our institution’s and others) contributes to building a resilient organization.

So where does “search” come in? Search is a human operation and begins with the workforce. Going back to Stewart, who commented on the need to recognize different kinds of knowledge, I posit that different kinds of knowledge demand different kinds of search. This is precisely what so many “enterprise search” initiatives fail to deliver. Implementers fail to account for all the different kinds of search: search for facts, search for expertise, search for specific artifacts, search for trends, search for missing data, etc.

When Dave Snowden states that “all of your workforce is a human scanner,” this could also imply the need for multiple, co-occurring search initiatives. Just as each workforce member brings a different perspective and capability to sensory information gathering, so too must enterprise search be set up to accommodate all the different kinds of knowledge gathering. And when Snowden notes that “There are limits to semantic technologies: Language is constantly changing so there is a requirement for constant tuning to sustain the same level of good results,” he is reminding us that technology is only good for cognitive augmentation. Technology is not “plug ‘n play”; you do not simply install it and reap magical cognitive insights. It requires constant tuning to adapt to new kinds of knowledge.

The point is one I have made before: it is the human connection, the human scanner, and human understanding of all the kinds of knowledge we need that bring coherence to an organization. The better we balance these human capabilities, the more resilient we will be and the more skilled at figuring out what kinds of search technologies really make sense for today. For tomorrow, we had better be ready for another tool for new fragments and new knowledge synthesis.

Lucene Open Source Community Commits to a Future in Search

It has been nearly two years since I commented on an article in Information Week, Open Source, Its Time has Come, Nov. 2008. My main point was the need for deep expertise to execute enterprise search really well, and I predicted the growth of service companies with that expertise, particularly for open source search. Not long after, Lucid Imagination was launched, with its focus on building and supporting solutions based on Lucene and its more turnkey version, Solr.

It has not taken long for Lucid Imagination (LI) to take charge of the Lucene/Solr community of practice (CoP), and to launch its own platform built on Solr, Lucidworks Enterprise. Open source depends on deep and sustained collaboration; LI stepped into the breach to ensure that the hundreds of contributors, users and committers have a forum. I am pretty committed to CoPs myself and know that nurturing a community for the long haul takes dedicated leadership. In this case it is undoubtedly enlightened self-interest that is driving LI. They are poised to become the strongest presence for driving continuous improvements to open source search, with Apache Lucene as the foundation.
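For readers unfamiliar with the stack: Lucene is, at its core, a Java library built around an inverted index that maps terms to the documents containing them, while Solr wraps that library in a ready-to-run server. As an illustrative sketch only (plain Python, not Lucene’s actual implementation), the inverted-index idea looks roughly like this:

```python
from collections import defaultdict

def build_index(docs):
    """Map each term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, query):
    """AND-query: return ids of documents containing every query term."""
    terms = query.lower().split()
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results

# Toy corpus, invented for illustration
docs = {
    1: "open source enterprise search",
    2: "Lucene is an open source search library",
    3: "commercial search products",
}
index = build_index(docs)
print(sorted(search(index, "open search")))  # → [1, 2]
```

Real Lucene adds analyzers, positional postings, relevance scoring, and on-disk segment files; the point here is only the underlying data structure that makes term lookup fast.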

Two weeks ago LI hosted Lucene Revolution, the first such conference in the US. Over 300 people attended in Boston, October 7-8, and I can report that this CoP is vibrant and enthusiastic. Moderated by Steve Arnold, the program ran smoothly and featured excellent sessions. Those I attended reflected a respectful exchange of opinions and ideas about tools, methods, practices and priorities. While there were allusions to vigorous debate among committers about priorities for code changes and upgrades, the mood was collaborative in spirit and tinged with humor, always a good way to operate when emotions and convictions are on stage.

From my 12 pages of notes come observations about the three principal categories of sessions:

  1. Discussions, debates and show-cases for significant changes or calls for changes to the code
  2. Case studies based on enterprise search applications and experiences
  3. Case studies based on the use of Lucene and Solr embedded in commercial applications

Since the first category was more technical in nature, I leave the reader with my simplistic conclusions: core Apache Lucene and Solr will continue to evolve in a robust and aggressive progression. There are sufficient committers to make a serious contribution. Many who have decades of search experience are driving the charge and they have cut their teeth on the more difficult problems of implementing enterprise solutions. In announcing Lucidworks Enterprise, LI is clearly bidding to become a new force in the enterprise search market.

New and sustained build-outs of Lucene/Solr will be challenged by developers with ideas for diverging architectures, or “forking” the code, on which Eric Gries, LI CEO, commented in the final panel. He predicted that forking will probably be driven by the need to solve specific search problems that current code does not accommodate. This will be more of a challenge for the spinoffs than for the core Lucene developers, and the difficulty of sustaining separate versions means most forks will ultimately fail.

The enterprise search cases reflected organizations for which commercial turnkey applications will not or cannot easily be selected; for them, open source makes sense. From LI’s counterpart in the Linux world, Red Hat, come earlier observations about why enterprises should embrace open source solutions: in short, the sorry state of quality assurance and code control in commercial products. Add to that the cost of services to install, implement and customize commercial search products, and the argument for many institutions is to go with open source whenever there is an imperative for major customization.

This appears to be the case for two types of enterprises that were featured on the program: educational institutions and government agencies. Both have procurement issues when it comes to making large capital expenditures. For them it is easier to begin with something free, like open source software, then make incremental improvements and customize over time. Labor and services are cost variables that can be distributed more creatively using multiple funding options. Featured on the program were the Smithsonian, Adhere Solutions (doing systems integration work for a number of government agencies), MITRE (a federally funded research laboratory), the U. of Michigan, and Yale. Cisco, a noteworthy commercial enterprise putting Lucene/Solr to work, also presented.

The third category of presenters was, by far, the largest contingent of open source search adopters: producers of applications that build Lucene and Solr (and other open source software) into their offerings. They are solidly entrenched because they are diligent committers and share in a community of like-minded practitioners who serve as an extended enterprise of technical resources, keeping their overhead low. I can imagine the attractiveness of a lean business that runs on an open source foundation and operates in a highly agile mode. This must be enticing and exciting for developers who wilt at the idea of working in a constrained environment with layers of management and political maneuvering.

Among the companies building applications on Lucene that presented were Access Innovations, Twitter, LinkedIn, Acquia, RivetLogic and Salesforce.com. These stand out as relatively mature adopters with traction in the marketplace. Also present were companies that contribute value through Lucene/Solr partnerships in which their products or tools are complementary, including Basis Technology, Documill, and Loggly.

Links to presentations by the organizations mentioned above will take you to conference highlights. Some will appeal to the technical reader, for there was a lot of code sharing and there were technical tips in the slides. The diversity and scale of applications being supported by Lucene and Solr were impressive. Lucid Imagination and the speakers did a great job of illustrating why and how open source has a serious future in enterprise search. This was a confidence-building exercise for the community.

Two sentiments at the end summed it up for me. On the technical front, Eric Gries observed that it is usually clear what needs to be core (to the code) and what does not belong; in between is a lot of gray area that will fuel constant debate in the community. For the user community, Charlie Hull of Flax opined that customers don’t care whether (the code) is in the open source core or in a special “secret sauce” application, as long as the product does what they want.