Search Engines Under the Hood

This week’s thoughts come from the serendipitous reading that routinely piles up on my desk. In this case, a short article in Information Week caught my eye because it featured Ken Krugler, co-founder of Krugle and the husband of a former neighbor. I’d set it aside because David Eddy, a member of my knowledge management forum group, keeps telling us that we need tools to facilitate searching for old but still useful source code. To do that, he believes, we need an investment in semantic search tools that normalize the voluminous language variants scattered throughout source code. That would enable programmers to find code that could be repurposed in new applications.
Now, I have taken the position that source code is just one kind of intellectual property (IP) asset that is wasted, abandoned, and warehoused for the technology archaeologists of centuries hence. I just don’t see a solid business case being made for developing a semantic search engine for proprietary treasure troves of code.
Enter old acquaintance Ken Krugler with what seems to be, at first glance, a Web search system that might be helpful for finding useful code out on the Web, including open source. I have finally visited his Web site, and I see language and new offerings that intrigue me. “Krugle Enterprise is a valuable tool for anyone involved in software development. Krugle makes software development assets easily accessible and increases the value of a company’s code base. By providing a normalized view into these assets, wherever they may be stored, Krugle delivers value to stakeholders throughout the enterprise.” They could be onto something big. This is a kind of enterprise search I haven’t really had time to think about, but maybe I will now.
One thing leading to another, I checked out Ken Krugler’s blog and saw an earlier posting: Is Writing Your Own Search Engine Hard? This is recommended reading for anyone who even dabbles in enterprise search technology but doesn’t want to get her/his hands dirty with the mechanics. It is short, to-the-point and summarizes how and why so many variations of search are battling it out in the marketplace.
I don’t want end-users to struggle too much with the under-the-hood details, but when you are thinking about enterprise search for your organization, it is worth considering how much technology you are getting for the value you want it to deliver, year after year, as your mountains of IP content accrue. Don’t give this idea short shrift, because search is an investment that keeps giving if it is chosen appropriately for the problem you need to solve.

1 Comment

  1. Thank you very much. That’s quite nice.
    I do quibble with the use of “old.”
    If an application works & does something useful for the organization, then loaded terms like old (bad) & new (good) seriously muddy the debate. Just try taking away a 40+ year old Fortran system from a life insurance company with the justification that Fortran is “old” and C++, Java, .NET, or Python are “new.”
    All I’m harping/ranting/beating on is that software (in the form of source code) is an important organizational resource (it’s not an asset; see below) that SHOULD be at the table with “document management” and “enterprise search.”
    I’m NOT saying that managers/business people should learn to read code. The rules of the business—as buried in source code—need to be far more findable than they are today. It is my point that the operational rules of the business are NOT buried in email, MSWord & PowerPoint documents, all of which are easily findable & searchable.
    As greater context… one of the huge problems with software is how it is handled by the accounting profession.
    In general, if you’re a software company making products for the market, you can capitalize your expenditures into the product. If you’re a consumer (Fidelity, Bank of America, etc.), it’s typical to expense software spending. Consequently there’s a huge incentive to treat software “maintenance” (a loaded pejorative if there ever was one) as a current period expense, since if an internal project were capitalized, it would look bad for most organizations to regularly write off multi-million dollar “assets” (e.g. bungled software projects). [Google “american lafrance” for a current comment. Hmmm…. no more fire engines & ambulances!?]
    Monday I attended an “Understanding Term Sheets” gig hosted by Boston CPAs. I had a discussion with a CPA who insisted that it was entirely proper & correct to fully depreciate software on a 3 year schedule (I think his point was that software only has a 3 year useful life, obviously a position that I totally reject). How that jibes with the reality that software applications are often in service for decades, particularly as complex eco-systems sprout up around them, is a question still to be answered.
    The ultimate result of such treatment is that “tangible assets” like furniture & leasehold improvements are on the balance sheet & systems (the nervous system of the enterprise) are not. In a global economy where we’re increasingly moving to producing bits with software tools rather than tangible hard goods/assets, such treatment really doesn’t make a lot of sense (in my less than humble opinion).
    Additional players I’ve found in the Krugle (“code search”) space are:
    Codefetch http://www.codefetch.com Hmmmm… website not responding?
    Codase http://www.codase.com 3 languages offered, 11 on the total list
    Koders http://www.koders.com 32 languages listed
    merobase http://www.merobase.com 46 languages listed
    The Krugle site does not offer an easily findable list of languages.
    Merobase’s claim of 46 languages is impressive if true. They’ve certainly got their head pointed in the right direction.
    Goes without saying that one of the huge problems with searching code usefully is the mind-numbing diversity of software languages. Do you know anyone who’s literate in 46 (human) languages? Can you list 46 human languages? I know I can’t.
    – David


© 2019 Bluebill Advisors
