Tag: Lucene

Lucene Open Source Community Commits to a Future in Search

It has been nearly two years since I commented on an article in Information Week, Open Source, Its Time has Come, Nov. 2008. My main point was the need for deep expertise to execute enterprise search really well. I predicted the growth of service companies with that expertise, particularly for open source search. Not long after I made that prediction, Lucid Imagination was launched, with its focus on building and supporting solutions based on Lucene and its more turnkey version, Solr.

It has not taken long for Lucid Imagination (LI) to take charge of the Lucene/Solr community of practice (CoP), and to launch its own platform built on Solr, Lucidworks Enterprise. Open source depends on deep and sustained collaboration; LI stepped into the breach to ensure that the hundreds of contributors, users and committers have a forum. I am pretty committed to CoPs myself and know that nurturing a community for the long haul takes dedicated leadership. In this case it is undoubtedly enlightened self-interest that is driving LI. They are poised to become the strongest presence for driving continuous improvements to open source search, with Apache Lucene as the foundation.

Two weeks ago LI hosted Lucene Revolution, the first such conference in the US. Over 300 people attended in Boston on October 7-8, and I can report that this CoP is vibrant and enthusiastic. Moderated by Steve Arnold, the program ran smoothly and featured excellent sessions. Those I attended reflected a respectful exchange of opinions and ideas about tools, methods, practices and priorities. While there were allusions to vigorous debate among committers about priorities for code changes and upgrades, the mood was collaborative in spirit and tinged with humor, always a good way to operate when emotions and convictions are on stage.

From my 12 pages of notes come observations about the three principal categories of sessions:

  1. Discussions, debates and show-cases for significant changes or calls for changes to the code
  2. Case studies based on enterprise search applications and experiences
  3. Case studies based on the use of Lucene and Solr embedded in commercial applications

Since the first category was more technical in nature, I leave the reader with my simplistic conclusions: core Apache Lucene and Solr will continue to evolve in a robust and aggressive progression. There are sufficient committers to make a serious contribution. Many who have decades of search experience are driving the charge and they have cut their teeth on the more difficult problems of implementing enterprise solutions. In announcing Lucidworks Enterprise, LI is clearly bidding to become a new force in the enterprise search market.

New and sustained build-outs of Lucene/Solr will be challenged by developers with ideas for diverging architectures, or “forking” code, on which Eric Gries, LI CEO, commented in the final panel. He predicted that forking will probably be driven by the need to solve specific search problems that current code does not accommodate. This will probably be more of a challenge for the spinoffs than for the core Lucene developers, and the difficulty of sustaining separate versions means those efforts will ultimately fail.

The enterprise search case studies came from organizations for which commercial turnkey applications will not or cannot easily be selected; for them open source makes sense. LI’s counterpart in the Linux world, Red Hat, offered earlier observations about why enterprises should seek to embrace open source solutions: in short, the sorry state of quality assurance and code control in commercial products. Add to that the cost of services to install, implement and customize commercial search products. For many institutions, the argument is to go with open source when there is an imperative or call for major customization.

This appears to be the case for two types of enterprises that were featured on the program: educational institutions and government agencies. Both have procurement issues when it comes to making large capital expenditures. For them it is easier to begin with something free, like open source software, then make incremental improvements and customize over time. Labor and services are cost variables that can be distributed more creatively using multiple funding options. Featured on the program were the Smithsonian, Adhere Solutions (doing systems integration work for a number of government agencies), MITRE (a federally funded research laboratory), U. of Michigan, and Yale. Cisco, a noteworthy commercial enterprise putting Lucene/Solr to work, also presented.

The third category of presenters was, by far, the largest contingent of open source search adopters: producers of applications that build Lucene and Solr (and other open source software) into their offerings. They are solidly entrenched because they are diligent committers, and they share in this community of like-minded practitioners who serve as an extended enterprise of technical resources that keeps their overhead low. I can imagine the attractiveness of a lean business that can run with an open source foundation and operate in a highly agile mode. This must be enticing and exciting for developers who wilt at the idea of working in a constrained environment with layers of management and political maneuvering.

The companies building applications on Lucene that presented included Access Innovations, Twitter, LinkedIn, Acquia, RivetLogic and Salesforce.com. These stand out as relatively mature adopters with traction in the marketplace. There were also companies present that contribute their value through Lucene/Solr partnerships in which their products or tools are complementary, including Basis Technology, Documill, and Loggly.

Links to presentations by organizations mentioned above will take you to conference highlights. Some will appeal to the technical reader, for there was a lot of code sharing and technical tips in the slides. The diversity and scale of applications that are being supported by Lucene and Solr were impressive. Lucid Imagination and the speakers did a great job of illustrating why and how open source has a serious future in enterprise search. This was a confidence-building exercise for the community.

Two sentiments at the end summed it up for me. On the technical front, Eric Gries observed that it is usually clear what needs to be core (to the code) and what does not belong. Between the two lies a lot of gray area, and that will contribute to constant debate in the community. For the user community, Charlie Hull of flax opined that customers don’t care whether (the code) is in the open source core or in the special “secret sauce” application, as long as the product does what they want.

Case Studies and Guidance for Search Implementations

We’ll be covering a chunk of the search landscape at the Gilbane Conference next week. While there are nominally over 100 search solutions that target some aspect of enterprise search, there will be plenty to learn from the dozen or so case studies and tool options described. Commentary and examples include: Attivio, Coveo, Exalead, Google Search Appliance (GSA), IntelliSearch, Lexalytics, Lucene, Oracle Secure Enterprise Search, Thunderstone and references to others. Our speakers will clue us in to the current state of search as it is being implemented. Several exhibitors will also be on site to demonstrate their capabilities, and they represent some of the best. Check out the program lineup below and try to make it to Boston to hear those with hands-on experience.

EST-1: Plug-and-Play: Enterprise Experiences with Search Appliances

  • So you want to implement an enterprise search solution? Speakers: Angela A. Foster, FedEx Services, FedEx.com Development, and Dennis Shirokov, Marketing Manager, FedEx Digital Access Marketing.
  • The Make or Buy Decision at the U.S. General Services Admin. Speaker: Thomas Schaefer, Systems Analyst and Consultant, U.S. General Services Administration
  • Process and Architecture for Implementing GSA at MITRE. Speaker: Robert Joachim, Info Systems Engr, Lead, The MITRE Corporation.

EST-2: Search in the Enterprise When SharePoint is in the Mix

  • Enterprise Report Management: Bringing High Value Content into the Flow of Business Action. Speaker: Ajay Kapur, VP of Product Development, Apps Associates
  • Content Supply? Meet Knowledge Demand: Coveo SharePoint integration. Speaker: Marc Solomon, Knowledge Planner, PRTM.
  • In Search of the Perfect Search: Google Search on the Intranet. Speaker: June Nugent, Director of Corporate Knowledge Resources, NetScout Systems

EST-3: Open Source Search Applied in the Enterprise

  • Context for Open Source Implementations. Speaker: Leslie Owen, Analyst, Forrester Research
  • Intelligent Integration: Combining Search and BI Capabilities for Unified Information Access. Speaker: Sid Probstein, CTO, Attivio

EST-4: Search Systems: Care and Feeding for Optimal Results

  • Getting Off to a Strong Start with Your Search Taxonomy. Speaker: Heather Hedden, Principal, Hedden Information Management
  • Getting the Puzzle Pieces to Fit; Finding the Right Search Solution(s). Speaker: Patricia Eagan, Sr. Mgr, Web Communications, The Jackson Laboratory.
  • How Organizations Need to Think About Search. Speaker: Rob Wiesenberg, President & Founder, Contegra Systems

EST-5: Text Analytics/Semantic Search: Parsing the Language

  • Overview and Differentiators: Text Analytics, Text Mining and Semantic Technologies. Speaker: Jeff Catlin, CEO, Lexalytics
  • Reality and Hype in the Text Retrieval Market. Speaker: Curt Monash, President, Monash Research.
  • Two Linguistic Approaches to Search: Natural Language Processing and Concept Extraction. Speaker: Win Carus, President and Founder, Information Extraction Systems

Exhibitors with a Search Focus:

© 2018 Bluebill Advisors
