What Am I Doing?
Since creating the two tag clouds for green tweeters and the blogs of green tweeters, I have been thinking about other ways of visualizing the data. What I wanted to to do was create a chart or table from the feed data but I was unsuccessful in locating a tool to do this.
Last weekend I came across this article, which discusses manipulating rss data in Google Spreadsheets. I then remembered an article by Digital Inspiration that outlined the steps for using Google Spreadsheets as a feed reader. I decided to manipulate the rss data within Google spreadsheets and try to make charts using the green tweeter data.
How I Created It
First, I would like to say that this is probably an extremely inefficient way to obtain this data and I would appreciate tips on other methods in the comments. I did, however, learn a lot about Google Spreadsheets, spreadsheet formulas and using charts and gadgets. In an earlier article, I pointed out that I was not proficient with Yahoo Pipes. I am even less accomplished with excel, scalc, and Google Spreadsheets functions. My methods may be convoluted but what I learned was valuable.
I did not know what kind of chart I wanted to create. My immediate goal was to see if I could even create a chart with a RSS feed. I decided to use the aggregated green tweeter rss feed that David Wescott created in Yahoo Pipes because I wanted the usernames included.
The spreadsheet I am referring to is publicly available here.
- The rss feed was imported into Google Spreadsheets with the following formula – = ImportFeed(“http://pipes.yahoo.com/pipes/pipe.run?_id=TDWICNg73RGnwBuLyp1_DQ&_render=rss”,). By default, the last 20 items in a feed are imported. This and other variables can be set in this formula. More information about this is available in the documentation for Google Docs. Four fields were imported; Two for the tweets, the date and time, and the url to the twitter post. I’m not sure why there are two fields for the tweets but this may be related to the way the pipe was created.
- After analyzing this imported feed, I decided I wanted to determine who were the most prolific green tweeters per the last 20 posts.
- I created formulas to strip the twitter username from the rest of the field. As I mentioned, Excel is not one of my strengths, so it took me quite some time to figure out how to do this. After spending way too much time on this I determined that the Mid and Find functions used together would do this for me. First, I used the Find function to count how many characters came before the “:” in the field because I only wanted the data before that point. Thank goodness the field contained that because otherwise I’m not sure how I would have done this. The formula is =FIND(“:” ,A2). This locates the : in the Field A2 and gives me the number of characters that comes before that. The number was placed in column D.
- Using the data from the Find function I used the Mid function to obtain just the Twitter username.
The formula is =Mid(A2,Find(“:”,A2)-(D2-1),D2-1). This basically finds the “:”, gets the number, and then subtracts that number minus 1 so the : isn’t included, and returns the username. I actually played with this formula for a long time before it worked.
- Since each user can appear more than once in the list of 20, I wanted to get each unique instance of the username and then a count of how many times that user posted out of the last 20 tweets. The unique instance was easier to determine with the surprise, surprise Unique function. I used =UNIQUE(F2:F21), which gave me a column of usernames listed once. The count of tweets per user was created with the Countif function = COUNTIF(F2:F21, H2). This formula counted the number of tweets in a particular column and then matched the numbers with unique instance of a username from the previous formula.
- The unique instance of the username stripped from the full tweet combined with the count of the number of tweets provided the data used to create the pie chart. The pie chart was created with Insert Gadget > Charts > Pie Chart > Add to Spreadsheet. I was able to select the columns and other chart settings that I wanted to use. I also moved the gadget to it’s own sheet within the spreadsheet.
- To publish the chart, I selected the Publish button, which provided the code to insert on this page. One problem I ran into was that I thought selecting Publish would make the chart dynamic, but when I looked at the chart the next day the data had not changed although the feed within the spreadsheet had. What I failed to do was select the Publish tab on the far right and check the box that says “Automatically re-publish when changes are made”.
- The pie chart is dynamic and will update when the RSS feed is updated in the spreadsheet. I have not determined what the update interval is and if anyone knows I would love to find out. It is also interactive. Clicking on any of the sections will provide you with additional information. Of course, similar to the problems I ran into with my Green Twitter Cloud, the charts rely on a number of services to work properly. Twitter must be up to produce the RSS feeds, Yahoo Pipes must be running to create a single aggregated RSS feed, Google Documents must be working for the feed to be imported and the chart created through the spreadsheet and my site must be up so the chart is displayed online. Whew! I worry most about Twitter, although this week I’ve been having more problems with Google. Although my project may seem convoluted and there is probably an easier way to do this I like it for two reasons. I learned so much and I have control over the data and what is displayed in the chart gadget.
I’ve made some additional gadgets with this green tweeter RSS feed data, which I will discuss in another article. I have also been thinking of a myriad of other uses for this technique. What could I do with other RSS feeds? What other charts could I make? Could I learn to make my own gadget? Could I import the data from my MySql videogame database and create charts from that data?
David Wescott says
OK. Now my brain hurts. I need to take a closer look at this and give it some thought.
It’s useful to see who’s using the channel the most.
It might be cool to see which green tweeter is active at certain times. like @sustainablog tends to tweet in the morning. So if I want to reach him that’s the time to do it.
David Wescotts last blog post – Welcome Back
Kim Woodbridge says
It hurts my brain too – part of the problem is that I didn’t explain it very well. I’m having a hard time writing like I talk … but really the only hard part was the excel functions – if I ever used excel it wouldn’t have been so hard.
I need an editor :-)
I guess the main point is that google spreadsheets allows feeds to be pulled in and analyzed – there’s a lot that can be done with that.
Kim Woodbridges last blog post – Green Tweeter Google Pie