On the weekend I blogged about the iSentia New Zealand Election Index graphic that I had seen in a few places. In that post I raised a few questions about how they had measured the mentions. I got distracted yesterday, but today I finally got around to contacting iSentia. I was given an email address for the person I needed to speak to, and we had a bit of back and forth via email, but even they didn’t seem able or willing to answer the questions that I raised. They said that the data was generated by buzznumbershq, which is their social media tracking and analytics tool. The person from iSentia pointed out that they were tracking mentions of not just handles but also full names in plain text, which makes the figures even more odd.
Firstly, let’s recap the figures that we are looking at:
We will ignore the top bit, about media story mentions; I am more interested in the social media stats. Apparently they represent the number of times that the full names or Twitter handles of the people concerned were mentioned during the period of 23-29 August this year. I thought some of them were rather low. So on the weekend, when I first blogged about this graphic, I did a quick manual count of the mentions that Metiria Turei had had in the 24-odd hours to Sunday morning. She had 110 mentions, roughly a third of them retweets, so I took the figure as being 76 actual mentions, which worked out to around 500 mentions in a week. Since learning what measures iSentia claim to be using, I think we can include the retweets. So unless there is a massive drop-off in the number of mentions she gets during the week, Metiria should be somewhere above the 700 mark (110 a day is 770 over seven days), based on my quick survey.
But I know I need better evidence. After doing some digging I came across Topsy, which lets you see how many mentions there have been of plain text search terms, handles, or both, in a given period. I have used it to collect data for the 7 days up to 1300 today for all of the accounts concerned. I know the periods don’t match up, so there will be some variation, but it shouldn’t be too big.
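For anyone who wants to check my back-of-the-envelope sums, here is a minimal Python sketch of the weekly estimate. It simply multiplies my one-day count by seven, which assumes every day of the week looks like the day I counted; that is my assumption, not something iSentia or Topsy told me. The function and variable names are made up for illustration.

```python
# Rough weekly estimate from a one-day manual count (figures from this post).
DAILY_MENTIONS = 110   # mentions counted in roughly 24 hours, retweets included
DAILY_ORIGINAL = 76    # the figure I used once retweets were set aside
DAYS_IN_WEEK = 7


def weekly_estimate(daily_count, days=DAYS_IN_WEEK):
    """Scale a one-day count to a week, assuming every day looks the same."""
    return daily_count * days


with_retweets = weekly_estimate(DAILY_MENTIONS)
without_retweets = weekly_estimate(DAILY_ORIGINAL)
print(f"Including retweets: about {with_retweets} mentions a week")
print(f"Excluding retweets: about {without_retweets} mentions a week")
```

The straight multiplication comes out a little higher than the round numbers I quoted, so if anything my rough figures err on the low side.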
Working backwards we have:
Metiria Turei: iSentia score 228. Topsy score 1745.
Laila Harre: iSentia score 317. Topsy score 1860.
Hone Harawira: iSentia score 411. Topsy score 148. (This was done using @MPHoneHarawira as the handle. There seem to be two possible accounts for him.)
Jamie Whyte: iSentia score 633. Topsy score 413.
Colin Craig: iSentia score 1307. Topsy score 1040.
Winston Peters: iSentia score 1335. Topsy score 2057.
Russel Norman: iSentia score 2039. Topsy score 2459.
David Cunliffe: iSentia score 2839. Topsy score 4828.
John Key: iSentia score 14048. Topsy score 13431.
So John Key’s score is the only one where I can believe that the difference, of around 600 mentions, is caused by the slightly different time windows being used. The rest raise questions about the accuracy of iSentia’s figures. For David Cunliffe, iSentia’s figure is roughly 40% below Topsy’s. Both windows include the debate last week, so that can’t be the explanation for the difference. When you look at Metiria and Laila, the figures are even further out of whack: Topsy has them at roughly 6 to 8 times the iSentia figures.
The Topsy figures also contradict the claim made by iSentia that the other minor party leaders are getting more mentions than the leaders of the Internet Mana Party. Laila Harre is getting more mentions than any minor party leader bar Russel and Winston, and she is not that far (197) behind Winston. Hone has never really had a great online presence, and despite his being the leader of the Internet Mana Party, I didn’t really expect that to change too much.
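To make the comparison easier to check, here is a small Python sketch that works out the gap and the Topsy-to-iSentia ratio for each leader. The numbers are the ones listed above; the `SCORES` table and the `compare` helper are just names I have made up for this illustration, and the script does not talk to either service.

```python
# (iSentia score, Topsy score) for each leader, copied from the list above.
SCORES = {
    "Metiria Turei": (228, 1745),
    "Laila Harre": (317, 1860),
    "Hone Harawira": (411, 148),
    "Jamie Whyte": (633, 413),
    "Colin Craig": (1307, 1040),
    "Winston Peters": (1335, 2057),
    "Russel Norman": (2039, 2459),
    "David Cunliffe": (2839, 4828),
    "John Key": (14048, 13431),
}


def compare(isentia, topsy):
    """Return the gap (Topsy minus iSentia) and the Topsy-to-iSentia ratio."""
    return topsy - isentia, topsy / isentia


for name, (isentia, topsy) in SCORES.items():
    gap, ratio = compare(isentia, topsy)
    print(f"{name:15} iSentia {isentia:>6}  Topsy {topsy:>6}  "
          f"gap {gap:>+6}  ratio {ratio:.2f}")
```

A ratio near 1 means the two sources roughly agree. Anything well above 1 means Topsy is seeing far more mentions than iSentia reports, and anything well below 1 means the reverse.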
It is not clear from the email conversation that I had with iSentia whether they are measuring mentions on Facebook as well. If they are, and the figures Topsy is using only come from Twitter, then the figures from iSentia are even further out of whack.
If the media and blogs, which is mainly where I have seen this graphic, are to carry it, surely there is an expectation that the information contained in it will be accurate, or at least stand up to some scrutiny? When a polling company that has signed up to the code of practice conducts and releases a poll, there are certain expectations about how they will do it and how the data is to be reported. But there is nothing like that here. In fact the only information that iSentia have released is the graphic above. There is no extra information on their website about how it has been put together either. It is only by actively seeking them out that I have found the extra information, which still hasn’t really answered the questions. When asked why the number of mentions they attribute to Metiria is less than the number of mentions on her Twitter feed, iSentia implied that it was because they were also tracking name mentions. How that can result in FEWER mentions than she got on Twitter I do not know.
Therefore the only conclusion that I can draw is that the data iSentia is currently collecting is not an accurate representation of the social media sphere. Whether this inaccuracy is due to bad data collection, bad data analysis or something else, I am not sure. But it still means the graphic and the data should be handled very carefully and sceptically.