Initial thoughts on my new social data flow


Firstly, apologies to anybody who couldn’t see the social data flow diagram in my earlier post – I’ve now hosted it on my own server so it shouldn’t disappear quite so often.

The new data flow is working quite well: ping.fm syndicates my status updates and micro-blogs, and FriendFeed aggregates my other activity and consolidates everything into a coherent set of feeds. I would like more control over how both ping.fm and FriendFeed format their output for each target system, but the defaults are adequate for now.

Here are the issues I’m experiencing:

1. Using Twitterfeed to take my FriendFeed RSS and pipe it into Twitter introduces a degree of latency that’s a bit inappropriate for Twitter. Twitter should be near real-time and conversational (whilst not being an instant messaging system), but Twitterfeed only reads my FriendFeed every half an hour. If I get ping.fm to update Twitter directly then I’m going to get some updates twice, which is exactly what I’m trying to avoid.

2. Some of the connected systems (identi.ca for example) are Twitter-like, and I’d like to treat them the same way as Twitter: as an output device for all my activity. But Twitterfeed is Twitter-specific and I haven’t found an equivalent for identi.ca etc. yet. In any case I don’t want a different Social ETL tool for each output system; I want Twitterfeed (or something) to do this for a number of target systems.

3. There’s no single solution for the whole class of location-based systems yet. Obviously I’d like to be able to update my location and travel plans in one place and have them propagate to TripIt, Dopplr, BrightKite etc. I think FireEagle may be trying to do this but it’s short of two things at the moment: co-operation from the other systems and an invitation for me. If anybody has access to FireEagle I would appreciate a way in.

4. Now I need to tackle the opposite problem: deduplicating my profligate friends’ updates. When Myrto uploads a picture to Flickr, for instance, I get notified about it four times. I can rationalise this a bit but I can’t ignore her FriendFeed or Twitter because I would miss some of her updates, so I’m condemned to hearing about her new pictures multiple times unless (a) she adopts a version of my social data flow or (b) I find a way to deduplicate the same update coming through multiple channels. This duplicate filter will be necessary until everybody adopts my architecture, i.e. for ever. Does anybody know of a solution out there?
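
To make the duplicate filter in point 4 a little more concrete, here is a rough Python sketch of one way it might work. It is not an existing tool. It assumes every incoming update has already been turned into a dictionary with hypothetical `channel`, `link` and `text` fields, and that copies of the same update either point at the same URL or carry essentially the same text. The function names are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_key(update):
    """Return a key that should match across copies of the same update.

    Assumption: `update` is a dict with optional "link" and "text" fields.
    """
    link = update.get("link")
    if link:
        parts = urlsplit(link)
        # Lowercase scheme and host, drop the query string and fragment,
        # and trim any trailing slash so trivial variations compare equal.
        return urlunsplit(
            (parts.scheme.lower(), parts.netloc.lower(),
             parts.path.rstrip("/"), "", "")
        )
    # No link: fall back to lowercased text with whitespace collapsed.
    return " ".join(update.get("text", "").lower().split())


def deduplicate(updates):
    """Keep only the first update seen for each canonical key."""
    seen = set()
    unique = []
    for update in updates:
        key = canonical_key(update)
        if key not in seen:
            seen.add(key)
            unique.append(update)
    return unique
```

Feeding it the Flickr, FriendFeed and Twitter copies of one photo would keep only the first copy, as long as they all link to the same photo page. It would not catch copies that go through a URL shortener (those would have to be resolved first), and throwing away the query string could wrongly merge links on sites that use it to identify the item, so treat this as a starting point only.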

This all seems quite difficult to manage and well beyond anybody who is either busy or technically challenged. There’s no real way of developing a packaged solution until all the social systems adopt a single sign-on technology like OpenID, and many of them have sound technical reasons for not doing so (OpenIDs can be created by anybody anywhere so it’s a bit like expecting system owners to trust a digital certificate that has no trusted root certification authority).

Anyway, that’s the state of play at the moment. More to come I’m sure.
