WeWork Paddington, London, W26LG | info@artimbarc.co.uk



  • cassidyuggla

Exploration of Artificial Intelligence and Art - Making Art Reviews More Accessible

Updated: Jun 12, 2018

Last time we took a look at some recent applications of machine learning in the cultural sector, and at how companies and institutions have used these tools to make art more accessible, allowing a wider audience to appreciate it.

All the examples that we listed used technology and images of some kind. A form of machine learning known as deep learning (specifically convolutional neural networks) has been very successful at classifying images (check out an example of this with our bird plane model here), and it just so happens that it is also very good at classifying text. Spotify leverages this when it recommends music: it trawls through music reviews to connect the dots between different songs, rather than relying only on your and your friends’ listening history.


GloVe is a machine learning algorithm for obtaining vector representations of words (for more details and a better description see link here). The algorithm is trained on over a billion words of text from Wikipedia, Twitter and several other sources. The vector representation of each word is calculated from the contexts in which that word is found within the corpus of text. If two words often appear in similar contexts, they will have “similar” vector representations. Mathematically, this means that they are close to each other in their vector space.

For example, take a piece of paper (our 2-d vector space) and draw four dots on it: one we call the “origin”, i.e. (0,0), and the other three are our “words”, represented by vectors relative to the origin. Let’s say our words A, B and C have respective vector representations (as defined by GloVe) of (0,1), (1,1) and (2,3). We can calculate the distance between them easily, using some simple maths.
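As a minimal sketch of that "simple maths", the snippet below computes the straight-line (Euclidean) distance between each pair of points. The function name `euclidean_distance` is our own choice for illustration, and the coordinates are the toy values from the example rather than real word vectors.

```python
import math
from itertools import combinations

# The three toy "word vectors" from the example, drawn on a 2-d sheet of paper.
points = {"A": (0, 1), "B": (1, 1), "C": (2, 3)}

def euclidean_distance(p, q):
    """Straight-line distance between two points with the same number of dimensions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Distance for every pair of words: A-B, A-C and B-C.
distances = {
    (first, second): euclidean_distance(points[first], points[second])
    for first, second in combinations(points, 2)
}

for (first, second), d in distances.items():
    print(f"{first}-{second}: {d:.2f}")
# prints:
# A-B: 1.00
# A-C: 2.83
# B-C: 2.24
```

The same function works unchanged for vectors with more than two dimensions, because it simply sums the squared differences over however many coordinates each point has.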

Just by eye we can see that the distance in our example between A and B is much shorter than that between A and C or B and C. This means that A and B have been found in more similar contexts within the corpus of text that GloVe was trained on than either of the other pairs, so in some way we can say that A and B are more “similar”. There are many use cases of this algorithm, but we wanted to explore one that was specific to the cultural sector and had the potential to make content more accessible to visitors of art galleries.

There are visitors to art galleries out there who can see the beauty of the works surrounding them but do not fully comprehend the thinking or the artist’s “idea” behind each of them. Galleries have tried to bridge this gap with exhibition guides and leaflets, but these sometimes leave a visitor without a History of Art degree more confused than before. Perhaps he is thinking to himself, “I know what the word ‘atmosphere’ means in the real world but what about the art world?!”. What if we gave him 5 words that were used in the same art-specific context - would that help?


The purpose of a thesaurus is... "to find the word, or words, by which [an] idea may be most fitly and aptly expressed", said Peter Mark Roget, architect of the best known thesaurus in the English language, Thesaurus of English Words and Phrases.


We took as our corpus of words (i.e. the possible outputs of the thesaurus) a relatively small number of art reviews found at various reputable sources. Then we gave the model a “target” word; this is the word we want to see context for. Each word can be mapped to a vector by the GloVe algorithm described above, and then we can use some slightly more complicated maths than above to find the 5 most “similar” or “closest” words in the corpus to the target word. In the sketched example above our vector space had two dimensions, but in this example our vector space has 100 dimensions (not easily represented in a diagram as we live in a 3 dimensional world!).
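To give a feel for this step, here is a minimal sketch of a nearest-words lookup. It assumes cosine similarity as the measure of "closeness", which is a common choice for word vectors but is our assumption here, not necessarily what the code linked below uses. The names `cosine_similarity` and `nearest_words` are hypothetical, and the 3-d vectors are made up for illustration; real GloVe vectors would have 100 numbers each.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest_words(target, corpus_vocab, vectors, k=5):
    """Return the k corpus words whose vectors are most similar to the target's."""
    target_vec = vectors[target]
    scored = [
        (word, cosine_similarity(target_vec, vectors[word]))
        for word in corpus_vocab
        if word != target and word in vectors  # skip words with no vector
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [word for word, _ in scored[:k]]

# Made-up 3-d vectors, purely for illustration (assumption: not real GloVe values).
toy_vectors = {
    "atmosphere": (1.0, 0.2, 0.0),
    "environment": (0.9, 0.3, 0.0),
    "climate": (0.8, 0.0, 0.4),
    "marble": (0.0, 1.0, 0.1),
    "this": (0.1, 0.1, 1.0),
}

# The corpus vocabulary plays the role of the distinct words in the art reviews.
corpus_vocab = ["environment", "climate", "marble", "this"]

print(nearest_words("atmosphere", corpus_vocab, toy_vectors, k=2))
# prints: ['environment', 'climate']
```

Only words that appear in the corpus can ever be returned, which is why the choice of art reviews shapes the thesaurus even though the vectors themselves come from GloVe.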

If you are interested in the code we used, check out our GitHub link here.


We chose 5 words and ran the algorithm for each word to find the 5 nearest words in the corpus of art review(s). For the first run we used 1 art review, which had 339 distinct words in it. For the second we used 10, and for the final run we used 50, which together had 8398 distinct words. This is what we found:

Corpus - 1 review — vocab size 399

Nearest to cognitive: dissonance, social, empathy, bodily, rational,

Nearest to analytically: convincingly, rational, dissonance, precisely, rightfully,

Nearest to sculpture: art, gallery, exhibition, marble, sculptural,

Nearest to atmosphere: environment, sense, very, light, this,

Nearest to interpretation: explanations, particular, text, contemporary, facts,

Corpus - 10 reviews — vocab size 3207

Nearest to cognitive: physical, psychological, processes, visual, therapy,

Nearest to analytically: stylistically, infinitely, contextualised, coexist, unpredictably,

Nearest to sculpture: sculptures, painting, art, paintings, exhibit,

Nearest to atmosphere: environment, cool, surface, tension, kind,

Nearest to interpretation: interpretations, context, biblical, explanation, narrative,

Corpus - 50 reviews — vocab size 8398

Nearest to cognitive: mental, physical, psychological, learning, spatial,

Nearest to analytically: stylistically, infinitely, maddeningly, analysed, cunningly,

Nearest to sculpture: sculptures, painting, art, paintings, exhibit,

Nearest to atmosphere: climate, environment, cool, earth, reflection,

Nearest to interpretation: interpretations, context, description, contrary, argument,

Chart showing that words found by the algorithm are more “similar” for the larger corpus

And then we compared it to THESAURUS.COM:

Nearest to cognitive: emotional, intellectual, mental, subjective, cerebral

Nearest to analytically: empirically, on trial, provisionally, temporarily, on probation

Nearest to sculpture: sculpt, carve, cast, chisel, cut

Nearest to atmosphere: air, pressure, envelope, heavens, sky

Nearest to interpretation: analysis, clarification, explanation, judgement, meaning


Even from this small collection of art reviews we can see the results improve as the size of the corpus grows. The thesaurus feels more ‘useful’ when 50 art reviews are used; e.g. filler words like ‘this’ (suggested for ‘atmosphere’ with 1 art review) are filtered out as the model has more data from which to pick out what is important in the context of art reviews.

The similarity between the target word and the model’s output words increases as the corpus size increases. This can be seen in the graph above for the target word “interpretation”. The model’s suggestions also seem more relevant to a gallery visitor than those hand-picked for the classic thesaurus; for example, context vs. clarification (target = interpretation) and stylistically vs. empirically (target = analytically).
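One possible reason for this trend can be sketched in a few lines. The sketch rests on two assumptions of ours: the word vectors are fixed in advance, and each larger corpus contains all the words of the smaller ones. Random 100-dimensional vectors stand in for real word vectors, and `mean_top_k_similarity` is a hypothetical helper, so this illustrates the idea rather than reproducing our chart.

```python
import math
import random

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mean_top_k_similarity(target_vec, candidate_vecs, k=5):
    """Average similarity of the k candidates closest to the target."""
    sims = sorted(
        (cosine_similarity(target_vec, v) for v in candidate_vecs),
        reverse=True,
    )
    top = sims[:k]
    return sum(top) / len(top)

random.seed(0)   # fixed seed so the run is repeatable
DIMENSIONS = 100  # same dimensionality as the vectors used in this post

def random_vector():
    # Stand-in for a word vector (assumption: not real GloVe values).
    return [random.gauss(0, 1) for _ in range(DIMENSIONS)]

target_vec = random_vector()
all_candidates = [random_vector() for _ in range(500)]

# Nested "corpora": each larger one contains every word of the smaller ones.
corpus_sizes = [10, 100, 500]
scores = [
    mean_top_k_similarity(target_vec, all_candidates[:n])
    for n in corpus_sizes
]

for n, score in zip(corpus_sizes, scores):
    print(f"{n:>4} candidate words: mean top-5 similarity {score:.3f}")
```

Under these assumptions the average can never go down as the corpus grows, because every word available in the smaller corpus is still available in the larger one, alongside new and possibly closer candidates.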

Now, let us return to our visitor, who is unsure what the exhibition meant when he read ‘the prevailing atmosphere is tenebrous’ (a quote from an art review of a National Gallery exhibition). He may now be more comfortable (and potentially more positive?) when talking about the exhibition with his friends over a beer that evening! Removing the occasionally intimidating barriers of academic art speak can only be a good thing for the sector, and word of mouth from these newly enfranchised visitors will help galleries reach the audiences that they are looking for.