
Search Engines; They’ve Been Around Longer Than You Think

It dates me, as well as search technology, to acknowledge that an article in Information Week by Ken North containing Medlars and Twitter in the title would be meaningful. Discussing search requires context, especially when trying to convince IT folks that special expertise is required to do search really well in the enterprise, and it is not something acquired in computer science courses.

The evolution of search systems from the print indexes of the early 1900s, such as Index Medicus (the National Library of Medicine’s index to medical literature) and Chemical Abstracts, to the advent of the online Medical Literature Analysis and Retrieval System (Medlars) in the 1960s was slow. Yet the evolution of search technology since the launch of Medlars has hardly moved at warp speed either. This article is highly recommended because it gives historical context to automated search while defining application and technology changes over the past 50 years. The comparison between Medlars and Twitter as search platforms is fascinating, something that would never have occurred to me to explore.

A key point of the article is the difference between a search system designed for archival content, with deeply hierarchical categorization for a specialized corpus, and one designed for highly transient, terse and topically generalized content. Last month I commented on the need to have search present in your normal work applications, and this article underscores the enormous range of purposes that search serves. Short-lived information and scholarly research each have a place in the enterprise, but it would be a stretch to think of searching for both types via a single search interface. Wanting to know what a colleague is observing or learning at a conference is very different from researching the effects of uranium exposure on the human body.

What has not changed much in the world of applied search technology is why we need to find information and how it becomes accessible. The type of search done in Twitter or on LinkedIn today is for information that we used to pick up from a colleague (in person or on the phone) or in daily or weekly industry news publications. That’s how we found the name of an expert, learned the latest technologies being rolled out at a conference or got breaking news on a new space material being tested. What has changed is the method of retrieval, though not by much, and the gain in efficiency may not be that great. Today, we depend on a lot of pre-processing of information by our friends and professional colleagues, who park information where we can pick it up on the spur of the moment. That is easy for us, but someone still spends the time to put it out there where we can grab it.

On the other end of the spectrum is that rich research content that still needs to be codified and revealed to search engines with appropriate terminology so we can pursue in-depth searching to get precisely relevant and comprehensive results. Technology tools are much better at assisting us with content enhancement to get us the right and complete results, but humans still write the rules of indexing and curate the vocabularies needed for classification.

Fifty years is a long time, and we are still trying to improve enterprise search. All it takes to make it work better is more human work.
