On 29 October 2019 I gave a short presentation at the joint PyData Manchester and Open Data Manchester meetup on the topic of Data Horror Stories. My talk was a data-driven exploration of the massive inflatable Halloween monsters of Manchester.
Halloween is the best
Boys and girls of every age / Wouldn’t you like to see something strange? / Come with us and you will see / This, our town of Halloween - The Nightmare Before Christmas
I friggin’ love Halloween! I get really into it. Last year, we organised a Halloween-themed all-day #rstats conference/meetup/thing (you can read all about that here), which involved some fantastic pumpkin carvings:
…and me dressed up as the broom package
(far right, with a broom and some functions taped to me…!)
When I first moved to the UK 9 years ago, Halloween was not really a thing at all, but more recently it has been adopted as a fun and spooky holiday for all age groups. In Manchester, for example, there are now all sorts of Halloween-related activities. One of these is a set of inflatable monsters dotted around the city centre. You can download a map to hunt for them all here.
Naturally, the monster walk appealed to me, and we set out to walk it, take nice photos, and then pick our fave monsters over some beers. In our household, the hands-down winner was “Blob”:
But what about the rest of the city? Which was Manchester’s favourite monster? This was the question I set out to answer in this talk.
image credit: [@OpenDataManchester](https://twitter.com/opendatamcr/status/1189253327550406657/photo/1)
One way to gauge what monsters people are photographing and sharing is to look at Instagram. I found two key hashtags that were relevant:
I wanted to select only posts that used both hashtags, because much of what came up under just one or the other was not monster-related content (at least not in the sense that I was after).
To acquire a set of photos with these hashtags, along with some of their metadata, I used the instaloader tool. To get only posts that had both hashtags, I modified this bit of code by aandergr. My version can be found on GitHub here: https://github.com/maczokni/halloweenMCR. This was the only bit of Python I used, however; I then swiftly read my retrieved JSON into R.
After some cleaning I had a nice bit of data with some monster photos and associated metadata. However, none of this metadata told me which monster was in each photo. This required some manual coding: I looked at each photo and recorded which monster I saw.
Finally, after all this was done, the results could be considered.
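As a rough illustration of the "both hashtags" condition (this is not the instaloader-based Python script linked above), the same check can be written in R once post metadata is loaded. The hashtag names and the structure of `posts` are placeholders invented for this sketch.

```r
# Hypothetical post metadata: each post has a shortcode and a vector of hashtags.
# The tag names are placeholders, not the hashtags used in the talk.
posts <- list(
  list(shortcode = "a1", hashtags = c("tag_one", "tag_two", "autumn")),
  list(shortcode = "b2", hashtags = c("tag_one")),
  list(shortcode = "c3", hashtags = c("Tag_Two", "tag_one"))
)

required <- c("tag_one", "tag_two")

# TRUE when every required hashtag appears in the post (case-insensitive).
has_both <- vapply(
  posts,
  function(p) all(required %in% tolower(p$hashtags)),
  logical(1)
)

# Keep only posts carrying both hashtags and list their identifiers.
both_posts <- posts[has_both]
kept_shortcodes <- vapply(both_posts, function(p) p$shortcode, character(1))
```

Requiring every tag in `required`, rather than any one of them, is what filters out the posts that only match a single, more general hashtag.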
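One way to attach hand-coded labels to the metadata is a simple join on a post identifier. The sketch below assumes a `shortcode` column identifies each post; the column names and all values are made up for illustration.

```r
# Hypothetical metadata, one row per post (column names are assumptions).
posts <- data.frame(
  shortcode = c("a1", "b2", "c3"),
  likes = c(10, 4, 7),
  stringsAsFactors = FALSE
)

# Hand-coded labels, entered after looking at each photo.
labels <- data.frame(
  shortcode = c("a1", "b2", "c3"),
  monster = c("Blob", "Dragon", "Blob"),
  stringsAsFactors = FALSE
)

# Join on the post identifier; all.x = TRUE keeps posts not yet labelled.
tagged_monsters <- merge(posts, labels, by = "shortcode", all.x = TRUE)
```

Keeping the labels in a separate table means the manual coding can be redone or corrected without touching the downloaded metadata.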
And the winner is…
I was working in the lab, late one night / When my eyes beheld an eerie sight / For my monster from his slab, began to rise - Bobby Pickett, Monster Mash
So finally we can get to some results.
First I considered the number of posts:
Well it seems like this round has been won by the dragon who lives atop the Printworks. Okay…
What about the most likes?
Yess, Blob back in the lead!
But this measure still favours monsters with more photos, as more photos mean more likes. What about likes per photo?
What is this?! Well, in this case the winner seems to be something I tagged as “fake”. While it is definitely not a current monster (currently in its place are the “orange eyes”), after further investigation I think it may not be fake but rather an image from last year. This is the image in question.
In either case, since I filtered for images in 2019 only, it should not be there, and is therefore DISQUALIFIED.
So instead the winner is….
…BLOB! What a champ
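The three rankings above (posts, total likes, likes per photo) can be sketched in base R roughly as follows. The column names `monster` and `likes` and every number here are made up for illustration; they are not the talk's data.

```r
# Made-up example rows; column names (monster, likes) are assumptions.
tagged_monsters <- data.frame(
  monster = c("Dragon", "Dragon", "Dragon", "Blob", "Blob", "fake"),
  likes   = c(10, 20, 30, 40, 50, 60),
  stringsAsFactors = FALSE
)

# Drop anything hand-tagged as "fake" (the disqualified entry).
valid <- tagged_monsters[tagged_monsters$monster != "fake", ]

# Number of posts and total likes per monster.
n_posts <- aggregate(likes ~ monster, data = valid, FUN = length)
names(n_posts)[2] <- "n_posts"
total_likes <- aggregate(likes ~ monster, data = valid, FUN = sum)
names(total_likes)[2] <- "total_likes"

# Combine and compute likes per photo, highest first.
summary_tbl <- merge(n_posts, total_likes, by = "monster")
summary_tbl$likes_per_photo <- summary_tbl$total_likes / summary_tbl$n_posts
summary_tbl <- summary_tbl[order(-summary_tbl$likes_per_photo), ]
```

Dividing total likes by the number of posts removes the advantage a monster gets simply from being photographed more often.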
All is well that ends well
In conclusion, it has been a fun exercise to play a bit with the Instagram API and see what sorts of information I can get out of it: the number of likes, the replies, and the URL of each photo. I want to explore more.
I also noted that there is this “may contain” feature, which uses some sort of image recognition to help describe posts for those with visual impairment. I used this to query some dog photos, for example (everyone loves dog photos!). A simple string-contains search and boom, I have dog + halloween monster photos!
# keep rows whose "may contain" description mentions a dog
dog_pics <- tagged_monsters[grepl('dog', tagged_monsters$may_contain), ]
There is much more to explore, though. Originally I was hoping to get information about what filters people use, based on this paper about Instagram filter choice being able to diagnose depression, but I did not get this info with instaloader. I guess I will keep exploring what is out there.
For anyone interested, all my code for this is on my GitHub page.