While people used to be terribly worried about hackers breaking into their computers and accessing their data, the current (and still growing) exhibitionism on the social web (feel free to call it ‘web 2.0’), combined with today’s search technology, already makes it possible to gain impressive insights not only into users’ private details, but also into their behaviour.
Such insights are highly relevant to, for example, the advertising industry, as they enable advertisers to ‘efficiently target’ users and to supply them with ‘tailored ads’, minimizing wasted ad coverage.
Not being able to index social networks thus constitutes a competitive disadvantage, which is why search engines are willing to pay for access to such data. However, because the data on e.g. Twitter and Facebook changes in “real time”, search engines had to modify the way they index data in order to cope with short-term peaks caused by unexpected events (e.g. the Hudson River plane crash, Michael Jackson’s death, more >>here<<).
The service TweetPsych, for example, creates a psychological profile of any public Twitter account and compares it to the others already in its database. This enables the service to identify those traits/issues that come up more or less frequently for the analysed user than for other users.
Far less creepy but still interesting, Google also offers a service to help you gain and combine information from the (social) web. The service, Google Social Graph, is still in beta and aimed at developers. It makes information about the public connections between people on the web easily available, whether those connections are expressed through certain markup (XFN and FOAF) or publicly declared in other ways. However, the service returns only the web addresses of public pages and the publicly declared connections between them. It cannot access non-public information, such as private profile pages or websites accessible only to a limited group of friends.
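To give an idea of what such a ‘publicly declared connection’ looks like, here is a minimal sketch of how XFN relationships can be read from a page. XFN puts relationship values such as ‘friend’ or ‘me’ into the rel attribute of ordinary links. This is not Google’s API or implementation; the names XFNLinkParser and extract_xfn, the sample page and its example.org URLs are made up for illustration, and only a subset of the XFN values is listed.

```python
from html.parser import HTMLParser

# A subset of the relationship values defined by XFN (assumption: the
# full XFN profile lists more values than are shown here).
XFN_VALUES = {"friend", "acquaintance", "contact", "met",
              "co-worker", "colleague", "me"}


class XFNLinkParser(HTMLParser):
    """Collects (href, relationships) pairs from <a rel="..."> links."""

    def __init__(self):
        super().__init__()
        self.connections = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        rel = attr_map.get("rel") or ""
        # rel holds space-separated tokens; keep only the XFN ones.
        relationships = sorted(set(rel.lower().split()) & XFN_VALUES)
        if href and relationships:
            self.connections.append((href, relationships))


def extract_xfn(html):
    """Return every link in `html` that declares an XFN relationship."""
    parser = XFNLinkParser()
    parser.feed(html)
    parser.close()
    return parser.connections


# Hypothetical blogroll, invented for this example.
SAMPLE_PAGE = """
<ul>
  <li><a href="http://alice.example.org/" rel="friend met">Alice</a></li>
  <li><a href="http://twitter.example.com/me" rel="me">My other profile</a></li>
  <li><a href="http://news.example.com/" rel="nofollow">Some link</a></li>
</ul>
"""

if __name__ == "__main__":
    for href, rels in extract_xfn(SAMPLE_PAGE):
        print(href, rels)
```

Anyone who can fetch the page can read these declarations, which is why no special access is needed to reconstruct such connections from public pages.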
Google’s statement on the sources of its data doesn’t necessarily mean much, though: if, for example, one of your friends on Facebook has made his profile fully public and opened it to search engines, search engines will also gain access to certain data from your profile.
“Certain categories of information such as your name, profile photo, list of friends and pages you are a fan of, gender, geographic region, and networks you belong to are considered publicly available to everyone, including Facebook-enhanced applications, and therefore do not have privacy settings. […]”