What is Search Relevancy?
Search is a term frequently used on the web these days. Generally, search means the use of a "search engine" to retrieve information based on input from a user. In this sense, search is magic: I type in what I want to find and the software finds it for me. But there's a lot going on behind the curtain, and much of it relates to search relevancy.
Search relevancy is simply a set of procedures, designed with both the user and the type of content in mind, that produce results useful to the user. Huh?
I'd start with Google.com, but it's one of the more complex models because of its size and the generality of its searches. Besides, it's business analysts, taxonomists, marketers, information and knowledge management professionals and others who really need to understand search relevancy - for the end user, it should just be "magic".
OK, so say you're involved in private equity finance, or hedge funds, or some sort of investment banking activity. And you are picked to be part of a "governance committee" for a new knowledge portal. You've been picked for your skills in finance and you're wondering what you'll be asked to contribute. Here are some quick tips about search relevancy and knowledge management or IR - information retrieval.
Information retrieval for search is composed of many different activities and processes. Search relevancy is in some ways the first and last piece of the puzzle. Back to our scenario now.
The project for your company is going to go out to a number of information sources and create a library of data (on the web it's called a corpus) that will help your finance experts make better decisions.
Relevancy starts with these questions:
- What type of data is being searched for?
- What type of input is used to request the information? We'll use "natural language" - in non-computer terms this means you just type what you want to find: "manufacturing takeover targets in Springfield, MA"
- What is the best, or most relevant, result set in a given situation?
In this case, the query "manufacturing takeover targets in Springfield, MA" contains a number of key phrases that a good knowledge portal will take advantage of to find relevancy in the corpus and return the best results.
Relevancy and Query Parsing Design
Early search engines would just take the phrase and look across all the data, or sometimes just the "text" fields, for matches on the whole phrase and on its individual words. If this were the case with your knowledge management portal, I'm guessing it would only report on past takeovers - not much good to you or your firm.
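To make that concrete, here is a minimal sketch of word-by-word text matching. The three records, their field names and the company names are all invented for illustration; nothing here is a real engine's API.

```python
# Hypothetical mini-corpus; records and field names are invented for illustration.
corpus = [
    {"id": 1, "text": "Acme Tooling completed its takeover of Bolt Works in 2009."},
    {"id": 2, "text": "Springfield Gear reports rising inventory and falling earnings."},
    {"id": 3, "text": "Analysts discuss takeover trends in manufacturing."},
]

def naive_search(query, records):
    """Return records whose text contains any word of the query."""
    words = query.lower().split()
    return [r for r in records if any(w in r["text"].lower() for w in words)]

hits = naive_search("takeover targets", corpus)
hit_ids = [r["id"] for r in hits]
print(hit_ids)  # [1, 3]
```

Record 2 describes the kind of company you actually care about, but it never uses the word "takeover", so plain text matching skips it and returns the past takeover and the general commentary instead.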
So relevancy dictates how a search (query) is treated after you hit "search". In this case you'd say to the knowledge portal consultants: "'takeover target' is a pretty specific set of conditions for our company. Just looking for the word 'takeover' may return a lot of information that is either too old or has to do with generalities, not our particular criteria for a takeover."
What you're saying is that in the case of a text search for the word "takeover" you would expect a lot of "false results" or irrelevant results unless this word is "parsed" according to your firm's criteria.
After a bit of back and forth you describe some conditions that your company would look for in a takeover candidate: high inventory, lowered earnings, leadership turnover, and debt ratios.
What a good business analyst will do at this point is create a set of measurable conditions based on the data you're acquiring. This is sometimes called "metadata" because it includes both data that the end user sees and data that the computer uses to evaluate the content. Second, they'll check whether and where this data exists in your corpus.
In this case, the query looks for data rather than a word or its literal meaning. The words are processed so that they return the records the user is expecting, not records that simply match text against the user input.
Simply stated, in this case, the query parser would identify the word "takeover" and probably "take over" as well, and know that this word has specific meaning and criteria.
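As a rough sketch of that recognition step, the parser could keep a table of "concept" terms and the spellings that trigger each one. The table name, the concept label and the pattern below are assumptions made for illustration, not part of any particular search product.

```python
import re

# Hypothetical table: concept name -> pattern covering its spellings.
# The pattern accepts "takeover", "take over", "take-over" and plural forms.
CONCEPT_PATTERNS = {
    "takeover": re.compile(r"\btake[\s-]?overs?\b", re.IGNORECASE),
}

def detect_concepts(query):
    """Return the sorted names of concepts whose pattern appears in the query."""
    return sorted(
        name for name, pattern in CONCEPT_PATTERNS.items() if pattern.search(query)
    )

found = detect_concepts("manufacturing takeover targets in Springfield, MA")
print(found)  # ['takeover']
```

Once a concept is detected, the parser can hand off to that concept's rules instead of matching the literal word against text.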
Let's look at how this might work in this situation:
User types in "manufacturing takeover targets in..."
Search sees the word "takeover" - and knows that there is a particular rule for this word
One of the rules is that when the word takeover is seen, the computer searches a field that holds inventory data about all companies in the database.
Our rule says: find all companies where inventory is greater than X, or where the ratio of inventory to some other measure is above or below X.
The next rule searches fields addressing the tenure of all C level executives to find recent exits, or a pattern of short tenures over a period of time.
The next rule searches a field containing debt ratios for a set of specified conditions.
Finally, the search looks for all companies that have posted lower than expected earnings or have disappointed the market by underperforming in earnings.
So you see, we've collected specific and measurable data from a variety of sources that MEET YOUR DEFINITION of "takeover" but don't contain the word "takeover" anywhere in the record set.
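The rule sequence above can be sketched in a few lines. Everything specific here is an assumption for illustration: the field names, the threshold values standing in for "X", the direction of the debt-ratio test, and the choice to require all four rules at once (a real portal might score or weight them instead).

```python
from dataclasses import dataclass

@dataclass
class Company:
    name: str
    sector: str
    city: str
    inventory_ratio: float    # inventory relative to sales (hypothetical measure)
    exec_exits_2y: int        # C-level departures in the last two years
    debt_ratio: float
    earnings_surprise: float  # negative = earnings below market expectations

# Hypothetical rules for the "takeover" concept; thresholds stand in for "X".
TAKEOVER_RULES = [
    lambda c: c.inventory_ratio > 0.5,   # high inventory
    lambda c: c.exec_exits_2y >= 2,      # leadership turnover
    lambda c: c.debt_ratio > 0.6,        # debt condition (direction assumed)
    lambda c: c.earnings_surprise < 0,   # disappointed on earnings
]

def find_takeover_candidates(companies, sector=None, city=None):
    """Return names of companies that pass the filters and every takeover rule."""
    results = []
    for c in companies:
        if sector and c.sector != sector:
            continue
        if city and c.city != city:
            continue
        if all(rule(c) for rule in TAKEOVER_RULES):
            results.append(c.name)
    return results

# Invented sample records; none of them contains the word "takeover".
companies = [
    Company("Springfield Gear", "manufacturing", "Springfield, MA", 0.8, 3, 0.7, -0.12),
    Company("Pioneer Valley Tools", "manufacturing", "Springfield, MA", 0.3, 0, 0.2, 0.05),
    Company("Bay State Retail", "retail", "Springfield, MA", 0.9, 2, 0.8, -0.20),
]

candidates = find_takeover_candidates(
    companies, sector="manufacturing", city="Springfield, MA"
)
print(candidates)  # ['Springfield Gear']
```

The rest of the query ("manufacturing", "Springfield, MA") is treated here as simple filters; in practice those phrases would go through their own parsing rules.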