by dr.k.conor on Sat Oct 23, 2010 7:00 am
The 'bugger' with search engines usually breaks down as follows:
Part-1: the programmed engine itself, which can be an originally programmed script or a clustering engine that collects the results of other SEs. Sometimes the engine is also limited to a subject or language domain.
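To make the 'clustering' idea concrete, here is a minimal sketch of merging ranked results from several engines. It assumes each engine simply hands back an ordered list of URLs; the engine names, URLs, and the reciprocal-rank scoring are illustrative assumptions, not how any particular SE actually does it.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to host + path so trivial variants count as one hit."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def merge_results(results_by_engine):
    """Merge ranked URL lists; score each URL by its summed reciprocal rank."""
    scores = defaultdict(float)
    first_seen = {}
    for urls in results_by_engine.values():
        for rank, url in enumerate(urls, start=1):
            key = normalize(url)
            scores[key] += 1.0 / rank
            first_seen.setdefault(key, url)  # remember the first spelling seen
    ranked = sorted(scores, key=lambda k: scores[k], reverse=True)
    return [first_seen[k] for k in ranked]


# Hypothetical result lists from two made-up engines.
sample = {
    "engine_a": ["http://example.org/aging", "http://example.com/wine"],
    "engine_b": ["http://www.example.com/wine/", "http://example.net/vinegar"],
}
merged = merge_results(sample)
print(merged)
```

A URL that several engines agree on rises to the top, which is roughly the appeal of clustering; the weakness is that it inherits whatever the underlying engines got wrong.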
Part-2: the data, or information resource. This affects the SE methodology: one engine is best in semantics, another in graphics, another in scholarly resources. Wikipedia is both a useful reference and a misleading 'blog'.
Part-0: not usually considered, but referred to here:
> return of a 'semantic-string' that is NOT within the desired subject-topic's domain; e.g. superman /= super + man
> return of the 'usual' definition: aging is considered by most to be a 'biological' process, but it also includes chemical changes, both desirable and undesirable: the 'aging' of grapes into wine, the 'aging' of wine into vinegar. If the internet had existed for the young Albert Einstein, very little relevant data/information would have been returned for quantum relativity; he was ahead of his time, meaning the data and data-links did not yet exist to give a return on the query. The counterpart is that what you want to research is deemed 'not acceptable', such as when the sources hold that TaiChi, Lao-tse and Chang Sanfeng are unrelated.
> return of nonsense or of misleading sources, such as the Paris Hilton archives when you wanted the Hilton Hotel, etc.
Thus Google, even advanced Google, may be able to shave down the returned information, but it still suffers from the weakness of the sources searched. For example, a query for 'theory of everything' may, through semantics or through deliberate 'bombing' by info providers, return everything you did not want to know about Christianity and divine creative forces.
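The Part-0 problems above can be partly worked around on the user's side by post-filtering whatever comes back. The following is only a rough sketch, assuming you already have result snippets as plain strings; the term lists and sample snippets are made up for illustration, and real semantic filtering is much harder than keyword matching.

```python
import re


def tokens(text):
    """Lowercase whole-word tokens; 'superman' stays one token, not 'super' + 'man'."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def in_domain(snippet, required, excluded):
    """Keep a snippet that has at least one required term and no excluded term."""
    words = tokens(snippet)
    return bool(words & required) and not (words & excluded)


# Hypothetical snippets for a query about the hotel chain.
results = [
    "Hilton hotel rooms and booking in Paris",
    "Paris Hilton photo archives",
    "Hilton opens a new hotel in Berlin",
]
required = {"hotel", "hotels"}
excluded = {"archives", "photo"}

kept = [r for r in results if in_domain(r, required, excluded)]
print(kept)
```

Matching on whole words is what keeps a query for 'man' from pulling in 'superman'. It does nothing for the Einstein case, though: no filter can return data that does not exist yet.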
Part-00: Mentioned above is the CSE, or custom search engine, supported by Google. With this you can delimit or stack the 'preferred' references you would want someone to see. The weakness is that you will be out of date before you finish, and that the site-links you use may be broken or obsolete. The links may also have been absorbed into a new owner's search tools and corrupted, or the new owner may want $$ to view them. Thus, speed of change and stability of the data come into consideration.
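As a rough illustration of the 'preferred references' idea and its staleness problem, here is a small sketch. It is not Google's CSE API; it just assumes a hand-kept table of preferred hosts with the date each was last checked. The hosts, dates, and the 180-day limit are all hypothetical.

```python
from datetime import date, timedelta
from urllib.parse import urlsplit

# Hypothetical preferred hosts, each with the date it was last verified by hand.
PREFERRED = {
    "example.org": date(2010, 10, 1),
    "example.edu": date(2009, 1, 15),
}
MAX_AGE = timedelta(days=180)  # assumed limit before an entry needs re-checking


def host_of(url):
    """Return the lowercase host of a URL without a leading 'www.'."""
    host = urlsplit(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def restrict(urls, preferred):
    """Keep only URLs whose host is on the preferred list."""
    return [u for u in urls if host_of(u) in preferred]


def stale_hosts(preferred, today, max_age=MAX_AGE):
    """List preferred hosts whose last check is older than max_age."""
    return sorted(h for h, checked in preferred.items() if today - checked > max_age)


urls = [
    "http://www.example.org/taichi",
    "http://example.com/ads",
    "http://example.edu/lao-tse",
]
kept = restrict(urls, PREFERRED)
stale = stale_hosts(PREFERRED, date(2010, 10, 23))
print(kept, stale)
```

The date column is the point: a preferred list is only as good as its last check, and a host that changed owners will still pass the filter until someone looks at it again.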
>>Personally, I use a customizable browser with one of Exalead, AllTheWeb, or GigaBlast, and create my own definitions and glossary.