*Social Media Services Provide A Rich Resource for Data Mining

In the past, people were terribly worried about hackers breaking into their computers and accessing their data. Today, the steadily growing exhibitionism on the social web (feel free to call it ‘web 2.0’), combined with the current state of search technology, already makes it possible to gain impressive insights not only into users’ private details but also into their behaviour.

Such insights are highly relevant to, for example, the advertising industry, as they enable advertisers to ‘efficiently target’ users and to supply them with ‘tailored ads’, minimizing wasted advertising reach.

Not being able to index social networks therefore constitutes a competitive disadvantage, and search engines are willing to pay to be granted access to such data. However, as the data on e.g. Twitter and Facebook changes in “real time”, search engines had to modify the way in which they index data in order to cope with short-term peaks caused by unexpected events (e.g. the Hudson River plane crash or Michael Jackson’s death).


The service TweetPsych, for example, creates a psychological profile of any public Twitter account and compares it to the other profiles already in its database. This enables the service to identify those traits and topics that the analysed user exhibits more or less frequently than other users.
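
To make the idea of comparing one account against a database of others more concrete, here is a minimal sketch in Python. It is not TweetPsych's actual method, which is not documented here. The word categories, the function names and the simple word-counting approach are all assumptions made purely for illustration.

```python
from collections import Counter
import re

# Hypothetical word categories; a real service would use a far larger lexicon.
CATEGORIES = {
    "work": {"meeting", "deadline", "office", "project"},
    "emotion": {"love", "hate", "happy", "sad"},
    "money": {"buy", "price", "pay", "cheap"},
}


def category_rates(tweets):
    """Return, per category, the share of all words that fall into it."""
    words = [w for t in tweets for w in re.findall(r"[a-z']+", t.lower())]
    total = len(words) or 1  # avoid division by zero on empty input
    counts = Counter()
    for word in words:
        for name, vocab in CATEGORIES.items():
            if word in vocab:
                counts[name] += 1
    return {name: counts[name] / total for name in CATEGORIES}


def compare_to_baseline(user_tweets, baseline_tweets):
    """Positive values: the user uses the category more than the baseline."""
    user = category_rates(user_tweets)
    base = category_rates(baseline_tweets)
    return {name: user[name] - base[name] for name in CATEGORIES}
```

Calling `compare_to_baseline(tweets_of_one_user, tweets_of_everyone_else)` returns one number per category. A positive value means the user touches on that category more often than the comparison group, and a negative value means less often.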


Far less creepy but still interesting, Google also offers a service that helps you gather and combine information from the (social) web. The service Google Social Graph, still in beta and aimed at developers, makes information about the public connections between people on the web easily available. These connections are expressed through certain markup formats (XFN and FOAF) and other publicly declared connections. The service, however, returns only the web addresses of public pages and the publicly declared connections between them. It is not able to access non-public information, such as private profile pages or websites accessible only to a limited group of friends.

Google Social Graph is meant to help users connect to their public friends more easily.
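
To illustrate what a ‘publicly declared connection’ looks like in practice: XFN expresses relationships through `rel` values on ordinary links, e.g. `rel="friend met"`. The following Python sketch collects such links from a page. It does not use or imitate Google's Social Graph service; the class and function names are made up for this example, and it relies only on Python's standard `html.parser` module.

```python
from html.parser import HTMLParser

# Relationship values defined by XFN 1.1.
XFN_VALUES = {
    "friend", "acquaintance", "contact", "met", "co-worker", "colleague",
    "co-resident", "neighbor", "child", "parent", "sibling", "spouse",
    "kin", "muse", "crush", "date", "sweetheart", "me",
}


class XFNLinkParser(HTMLParser):
    """Collects (href, [xfn values]) for every <a> tag carrying XFN markup."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        # rel may hold several space-separated values; keep only XFN ones.
        rels = set((attr.get("rel") or "").lower().split()) & XFN_VALUES
        if href and rels:
            self.links.append((href, sorted(rels)))


def extract_xfn(html):
    """Hypothetical helper: return all XFN-annotated links in an HTML string."""
    parser = XFNLinkParser()
    parser.feed(html)
    return parser.links
```

Anyone who can fetch a public page can read these annotations, which is why only publicly declared connections are involved here.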

Google’s statement on the sources of its data doesn’t necessarily mean much, however. If, for example, one of your friends on Facebook has fully published his profile and opened it to search engines, search engines thereby also gain access to certain data from your profile.


Since the most recent change to Facebook’s Privacy Policy in December 2009, some data (picture, current city, friends list, gender, and fan pages) is deemed to be ‘publicly available information’, which means that users have no way to prevent any other Facebook user from viewing this information on their profile. It is therefore easy for a marketer, for example, to create a dummy Facebook account and to supply Facebook with an email list of their customers. Facebook then scans the email list and, as a consequence, supplies the marketer with all of the information below about those customers:


“Certain categories of information such as your name, profile photo, list of friends and pages you are a fan of, gender, geographic region, and networks you belong to are considered publicly available to everyone, including Facebook-enhanced applications, and therefore do not have privacy settings. […]”

This data is furthermore accessible to the developers of applications used by your friends. That means you don’t even have to use the apps yourself for the developers of your friends’ apps to get your publicly available information. An option that could be used to prevent this (the Facebook API opt-out) was removed from Facebook with its last Privacy Policy change. For more information on this issue, please refer to the EFF.

5 Responses to “*Social Media Services Provide A Rich Resource for Data Mining”


  1. 1 Fernando Fernández 13/01/2010 at 04:00

    Max: Can I translate this post into Spanish?

    • 2 austrotrabant 13/01/2010 at 07:42

      Sure, sure…

  2. 3 Mark Clayson 29/01/2010 at 09:25

    You raise a good topic here. Thanks so much for your posting.


  1. 1 *Still Unsue which Flat to Burgle This Weekend? Ask Twitter « Austrotrabant's Blog Trackback on 19/02/2010 at 15:07
  2. 2 *How Much Information Does A Search Query Reveal About A User? « Austrotrabant's Blog Trackback on 21/04/2010 at 08:33
