Flipping through the German blogosphere yesterday, I came across a rather specialist IT website which explained at length and in great detail why it is actually impossible to register the location of wireless networks (WiFi, WLAN) without at the same time also capturing some of the user information transmitted (parts of emails, websites etc.: the ‘payload’). The post by Kristian Köhntopp referred to the standard used for wireless networks (IEEE Std 802.11) and (I hope I got that right) explained that, due to the nature of wireless networks, different types of information packets (frames) are transmitted over the same channel. While some of these frames contain information about the wireless network itself (beacon), others contain the actual information (payload). The beacon frames, however, are the ones relevant for Google, as they contain the MAC address and the ID of the network.
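To make the difference between beacon and payload frames a little more tangible, here is a minimal sketch in Python. It assumes that each captured frame is available as raw bytes starting at the 802.11 MAC header (real capture tools usually prepend their own header, which is ignored here). The function names are made up for illustration, and this is of course not the code Google used; it only shows that a scanner has to receive a frame first before it can tell from the first byte what kind of frame it is.

```python
# Frame types from the 802.11 frame control field (bits 2-3 of the first byte).
MANAGEMENT, CONTROL, DATA = 0, 1, 2
BEACON_SUBTYPE = 8  # management subtype for beacon frames (bits 4-7)


def classify_frame(frame: bytes) -> str:
    """Return 'beacon', 'data' or 'other' based on the frame control byte."""
    frame_control = frame[0]
    frame_type = (frame_control >> 2) & 0b11
    subtype = (frame_control >> 4) & 0b1111
    if frame_type == MANAGEMENT and subtype == BEACON_SUBTYPE:
        return "beacon"
    if frame_type == DATA:
        return "data"
    return "other"


def beacon_bssid(frame: bytes):
    """Return the network's MAC address (BSSID) of a beacon frame, else None.

    In a beacon frame the BSSID is the third address field, i.e. bytes
    16 to 21 of the 24-byte MAC header.
    """
    if len(frame) < 24 or classify_frame(frame) != "beacon":
        return None
    return ":".join(f"{b:02x}" for b in frame[16:22])


def keep_beacons(sample):
    """Drop everything except beacon frames from a captured sample."""
    return [f for f in sample if classify_frame(f) == "beacon"]
```

Running `keep_beacons` over a sample throws the payload frames away, but only after they have already been captured, which is exactly the point of the post referred to above.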
When scanning for wireless networks in a city, however, one is not facing a single wireless network in the middle of nowhere, sending and receiving in a steady and continuous stream between two points on a distinct frequency, but a multitude of sources (some of which might only be interference) wildly broadcasting on the same frequency. So what Google did was to take a sample (let’s compare it to a picture of a group of friends taken with your camera at a busy square) of all the network information that was broadcast on a certain frequency for a short span of time (0.2 sec). Staying with the example of the square, you might only have wanted to take a picture of your posing friends, but as you also wanted to enrich your holiday snapshot with the imposing image of Vienna’s St. Stephen’s Cathedral in the background, you inevitably also photographed some bystanders who just happened to be running around at that time (if they’re moving quickly, they aren’t Austrians ;) ).
Coming back to the wireless networks, all the Google StreetView cars did was collect the images and WiFi samples, which were later uploaded and processed by some corporate super-computer (‘the mothership’). Otherwise I would have to assume that there are Google StreetView trucks driving through our cities which host not only computers capable of instantly blurring the faces of people and (some) writing in all the pictures taken, but also a bunch of college students overseeing this process and immediately deleting information by hand if the computers have missed something. So I think we are safe to assume that all the data collected by the cars on the street is processed by Google later. But as explained above, before this can be done, the unmodified data (that is, the unaltered samples) first has to be collected and uploaded. And only this processing turns the raw bundles of collected data into a valuable source of information. That is also what another company, Skyhook, has been doing for some time (it claims to record only the ‘“Here I am” signal’).
Why? Because knowing the exact location of wireless networks (as well as of other sources such as GSM towers etc.) helps to improve the precision with which mobile devices can determine their position. Currently most handheld devices rely on GPS data, which is good, but not overly accurate and sometimes not available. The wireless network of a small coffee shop, however, is perfectly suited to inform a handheld device about its position, as the signal of the wireless network can not only be attributed to a certain location, but can also still be received right in the middle of a multi-storey shopping centre which, due to its steel and concrete construction, is well shielded from any GPS signals.
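For the technically curious, the idea can be sketched in a few lines of Python. The following is only an illustration under my own assumptions, not the method Google or Skyhook actually use: it takes a hypothetical database of known network positions and a list of the networks a device currently sees, and estimates the device's position as an average of the known positions, weighted by signal strength. All names and coordinates are made up.

```python
def estimate_position(observations, known_networks):
    """Estimate (lat, lon) as a signal-weighted centroid of known networks.

    observations:   list of (mac_address, signal strength in dBm) pairs
    known_networks: dict mapping mac_address -> (lat, lon)
    Returns None if none of the observed networks is in the database.
    """
    total_weight = 0.0
    lat_sum = 0.0
    lon_sum = 0.0
    for mac, rssi_dbm in observations:
        if mac not in known_networks:
            continue  # unknown network, cannot help with positioning
        # Convert dBm to a linear power value so stronger signals count more.
        weight = 10 ** (rssi_dbm / 10)
        lat, lon = known_networks[mac]
        lat_sum += weight * lat
        lon_sum += weight * lon
        total_weight += weight
    if total_weight == 0:
        return None
    return (lat_sum / total_weight, lon_sum / total_weight)


# Hypothetical example: two networks with known (invented) coordinates.
known = {
    "00:1a:2b:3c:4d:5e": (48.2000, 16.3000),
    "00:1a:2b:3c:4d:5f": (48.2010, 16.3020),
}
seen = [("00:1a:2b:3c:4d:5e", -50), ("00:1a:2b:3c:4d:5f", -50)]
position = estimate_position(seen, known)
```

With two equally strong signals the estimate lands halfway between the two networks; the stronger one signal gets, the closer the estimate moves towards that network. No GPS is needed, which is why this also works indoors.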
And, as this is a blog about advertising, you will not be surprised at this point to hear that this information is valuable not only for the owner of the handheld device, who will be pleased to be able to find himself on the map of the shopping centre, but also for advertisers, as it allows users to be shown a company’s advertising (an ad about discounts in a shop) right before they reach its shop window.
So… my dear German friends, Google is not scanning your mail (they could do that much more easily by just taking a look at the emails you or your best friend are storing on their Google Mail servers), nor are they interested in which clips you are currently watching on YouTube. It’s all about advertising and advertising revenues, not about user information.
You might add now that Google is not your friend and that you thus don’t want to be photographed by Google anyway. That might be true; however, as soon as an individual (at least in Austria) steps out of his flat or house, he can’t object to (unintentionally) becoming part of a picture taken of his street. And, returning to Google StreetView, you won’t be surprised to hear that Dennis Schultz explicitly said at an event held in Vienna last year (Security 09) that it cost Google quite some effort to blur faces and writing in all the images, as the people in the images actually only disturbed the pictures. We can therefore conclude that Google is more interested in the colour and the shape of the post box on the wall of a house than in the person standing in front of it.
If my thoughts weren’t too cryptic, you should at this point be asking yourself: ‘but haven’t they already admitted that they’ve made a mistake by using this piece of code which…?’ Yes, I think it was a mistake that Google did not delete the ‘unwanted side-product’ which they had harvested while taking samples. But on the other hand, the fact that Google did admit this shows, in my eyes, firstly that they have really investigated in great detail all the issues raised in the catalogue of questions from the German data protection authorities, and secondly that they were honest enough (had the guts) to openly and fully admit to a mistake which maybe nobody would have discovered, or which at least nobody would have been able to prove. That indeed impresses me personally.