Charles Young

I know I'm a bit of an MS fanboy at times, but please, am I missing something here? Microsoft, with permission of users, exploits clickstream data gathered by observing user behaviour. One use for this data is to improve Bing queries. Google equips twenty of its engineers with laptops and installs the widgets required to provide Microsoft with clickstream data. It then gets its engineers to repeatedly (I assume) type in 'synthetic' queries which bring back 'doctored' hits. It asks its engineers to then click these results (think about this!). So, the behaviour of the engineers is observed and the resulting clickstream data goes off to Microsoft. It is processed and 'improves' Bing results accordingly.
What exactly did Microsoft do wrong here?

Google's so-called 'Bing sting' is clearly a very effective attack from a propaganda perspective, but is poor practice from a company that claims to do no evil. Generating and sending clickstream data deliberately so that you can then subsequently claim that your competitor 'copied' that data from you is neither fair nor reasonable, and suggests to me a degree of desperation in the face of real competition. Monopolies are undesirable, whether they are Microsoft monopolies or Google monopolies. Personally, I'm glad Microsoft has technology in place to observe user behaviour (with permission, of course) and improve their search results using such data. I can only assume Google doesn't implement similar capabilities. Sounds to me as if, at least in this respect, Microsoft may offer the better technology.

[UPDATE]...and here are a few links that support a similar perspective or at least offer an even-handed appraisal. For the most part, they are clearly written by people better informed than me. Incidentally, I hadn't read any of these links before posting - any similarity really is coincidental. I wasn't copying stuff from Google searches :-)

posted on Friday, February 4, 2011 7:26 PM


# re: The Bing Sting - an alternative opinion 2/4/2011 9:58 PM Nick Harrison
I don't know about it necessarily being a problem with copying data, but the problem that I have with it is that they did not "improve" the data. They did not process it, they did not filter it. They did nothing but copy it unchecked.

Use the so-called "click stream" data, but look at it before you blindly incorporate it into your results.

Google used this to show that Bing copied their bogus data. What's to stop a clever hacker from doing something similar to distort page rankings?

Don't trust a data source without validation, regardless of who the source is.
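
To make the validation point concrete, here is a minimal sketch of one possible check: accept a (query, clicked URL) pair only when enough distinct users have produced it. This is purely illustrative and says nothing about how Bing actually processes clickstream data; the function name, the tuple layout, and the threshold of 50 are all assumptions made for the example.

```python
from collections import defaultdict


def corroborated_pairs(observations, min_users=50):
    """Return the (query, url) pairs seen from at least `min_users` distinct users.

    `observations` is an iterable of (user_id, query, url) tuples.
    Hypothetical helper: the name, shape, and default threshold are assumptions.
    """
    users_by_pair = defaultdict(set)
    for user_id, query, url in observations:
        # A set means repeated clicks by the same user only count once.
        users_by_pair[(query, url)].add(user_id)
    return {
        pair for pair, users in users_by_pair.items() if len(users) >= min_users
    }
```

With a threshold like this, twenty people clicking the same result over and over would still count as only twenty users, so the pair would be ignored unless the threshold were set that low.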

# re: The Bing Sting - an alternative opinion 2/4/2011 10:56 PM Brian
You are absolutely 100% correct - you're a Microsoft fan boy and not thinking clearly about this issue.

The Bing '1 in a 1000' filter explicitly looks for Google-specific queries as part of the click stream. Think about that for a moment: the Bing algorithm contains code that will *only* work when a user goes to the site, makes a search POST and then hits the 'result' link - all parsed by Bing under the 'observing user behavior' banner.

Bing's weighting is changed off the back of the Google algorithm. If Google didn't exist (or didn't find a result) then the Bing algorithm would be poorer. That honestly sounds fine to you? Wouldn't the easiest thing in the world be for the Bing team to just deny this? (They've been very careful never to do that.)

Also, go install IE8 and click the 'Yes please, suggested sites sound fine'. Now go ask your mother if she knows she just agreed to the above behavior. 'User streams' - fine, all toolbars (and browsers, I guess) can do that but 'competitor specific code stream', well, that at least sounds more ambiguously moral to me.

Anyway, not sure why I'm commenting here, as this is really a religious debate where normally intelligent people are now drooling for their 'team's side'. I'm saddened by people's obvious partisan reactions and expected better.

# re: The Bing Sting - an alternative opinion 2/4/2011 11:53 PM Charles Young
@Brian. I agree re. 'religious debate' and hope I'm not simply adopting the position of a fan boy - hey I use Google all the time. This issue is causing such a fuss, and I want to offer a different perspective because I really do think Google are taking cheap shots at a competitor. They should (and I suspect do) know better. Clickstream analysis is a fact of internet use and I don't see that Microsoft is adopting fundamentally different practice to other companies (including Google) in using it. Nor do I see that clickstream analysis is a big problem provided that all data is anonymised and aggregated, and that users are given the chance, at the very least, to opt out (I prefer opting in).

So, we are left with the fact that, as you correctly point out, Microsoft must presumably be detecting a complex event that consists of a Google search and a subsequent click on a result. They are probably doing the same with searches against other providers. My point is that Microsoft is doing nothing more than what they state - observing user behaviour (in this case search behaviour) and using these events to improve their search results.
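
As a rough illustration of what detecting such a 'complex event' might involve, the sketch below pairs each search-results page with the next page visited in the same session. It is a guess at the general shape of the technique, not a description of Microsoft's implementation; the host table, the `q` parameter name, and the function name are assumptions for the example.

```python
from urllib.parse import urlparse, parse_qs

# Assumption: hosts treated as search providers, mapped to the query-string
# key that carries the search terms.
SEARCH_HOSTS = {"www.google.com": "q", "www.bing.com": "q"}


def search_click_pairs(events):
    """Yield (query, clicked_url) for each search followed by a click.

    `events` is an iterable of (session_id, url) tuples in time order.
    """
    pending = {}  # session_id -> terms from that session's latest search page
    for session_id, url in events:
        parts = urlparse(url)
        key = SEARCH_HOSTS.get(parts.netloc)
        if key is not None:
            # A page on a search provider: remember the query, if there is one.
            terms = parse_qs(parts.query).get(key)
            if terms:
                pending[session_id] = terms[0]
        elif session_id in pending:
            # First non-search page after a search: treat it as the clicked result.
            yield pending.pop(session_id), url
```

The pairs that come out are the raw material for the kind of aggregation described here: count how often each result is chosen for a given set of keywords across many users, and weight the ranking accordingly.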

I don't find it hard to understand why Google has been so successful in claiming that Microsoft's behaviour is wrong. But is it? Really? The reason the Internet grew so huge is that it is an open environment where huge amounts of data are shared freely. Of course, people can secure data if they need to. They can opt out of sharing data if they wish (e.g., preventing search provider crawlers from indexing their content, although this is done on trust). What is so wrong with Microsoft exploiting freely available information about the searches people do? They are, I imagine, simply observing that when people search for specific keywords, they generally click on certain links. Hence they give those results a higher ranking in Bing. This seems eminently sensible. The issue is that they are observing searches on other search providers. Well, hey, that's the freedom that the Internet offers.

@Nick: Given the fact that Google had a team of engineers repeatedly typing in the same keywords and clicking the same results, it is hardly surprising that Bing started offering the same results for the same keywords. This isn't evidence that their ranking algorithms are poor. It doesn’t constitute 'copying' of data in the way that is being suggested. It just means that their clickstream analysis saw lots of people selecting the same result for a given search and amended Bing accordingly.

It's in my best interests that Microsoft and Google compete head on. I frankly don't care who 'wins'. I want the best search facilities possible. I use multiple search providers (currently mainly Google and Bing), and regularly (several times a day) run the same searches against both engines. My one complaint about IE9 is that it is harder to do this than it was in IE8. I have no clear favourite between the two, though I think Google's maturity does still count a bit over Bing, and I especially find that Google tends to find and index new content much faster. I use both providers because their search results, although often similar, are also often complementary. I sometimes find that one provider returns almost nothing whilst the other returns lots (it's fairly evenly split between the two), and the ranking is often weighted noticeably differently, which rather suggests that Microsoft's use of Google search event data is not a dominant feature of their ranking algorithms. Google might do worse than use analysis of Bing (Yahoo) searches to improve their own algorithms. I don't see why anyone should mind. The Internet is free.

