Pedro Jacobetty, The University of Edinburgh

virPrague 20: Big Data, Information Sciences, Technosphere

This project started with a desire to understand what people were writing about blockchain technologies. It uses topic modelling, a machine learning technique for natural language processing (NLP) that is useful for analysing large volumes of documents. Its objective is to uncover latent structures of meaning (semantic structures) that organise the distributions of words (what are commonly referred to as topics) present throughout the documents. In this case, the model was trained on 10,000 posts from the blogging platform Medium that their authors had tagged "blockchain" (a graphical visualisation of the model can be found here: http://zeroknowledge.tk/lda.html).
The Soundchain music generator monitors online discourse about blockchain technology in real time using the Twitter streaming API (filtered by the hashtag #blockchain). Each incoming tweet is interpreted as it arrives by the previously trained topic model.
The genesis tweet (the first tweet captured when the generator starts) is used to create the melody and the rhythm. I mapped the identified topics to parameters of the music generator (e.g. financial technology and cryptocurrencies increase distortion, while technological development increases the tempo in BPM), transforming the Twitter stream into a never-ending sound piece.
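
To make the topic-to-sound mapping more concrete, here is a minimal sketch of how a tweet's topic weights could drive the two parameters mentioned above. It is an illustration only, not the actual Soundchain code: the topic labels, the `SynthParams` class, the `params_from_topics` function, and the BPM range of 90 to 180 are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Mapping

# Assumed tempo range; the real generator's limits are not stated in the text.
BASE_BPM = 90.0
MAX_BPM = 180.0


@dataclass
class SynthParams:
    distortion: float  # 0.0 (clean) to 1.0 (fully distorted)
    bpm: float         # tempo in beats per minute


def clamp(value: float, low: float, high: float) -> float:
    """Keep value inside the closed interval [low, high]."""
    return max(low, min(high, value))


def params_from_topics(topic_weights: Mapping[str, float]) -> SynthParams:
    """Turn a tweet's topic weights (hypothetical labels) into sound parameters."""
    # Financial technology and cryptocurrency topics both push distortion up.
    finance = (topic_weights.get("financial_technology", 0.0)
               + topic_weights.get("cryptocurrencies", 0.0))
    # The technological development topic pushes the tempo up.
    tech = topic_weights.get("technological_development", 0.0)

    distortion = clamp(finance, 0.0, 1.0)
    bpm = BASE_BPM + clamp(tech, 0.0, 1.0) * (MAX_BPM - BASE_BPM)
    return SynthParams(distortion=distortion, bpm=bpm)


if __name__ == "__main__":
    # Example weights such as a topic model might assign to one tweet.
    weights = {"cryptocurrencies": 0.25, "technological_development": 0.5}
    print(params_from_topics(weights))
```

In this sketch every new tweet would simply overwrite the current parameters; a smoother piece could instead blend each new value with the previous one, which is a design choice the description above leaves open.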