We all know those annoying polls during election season.
The Democrats are winning by a landslide! (MSNBC)
The Republicans are running away with it! (Fox News)
We don’t know what the hell’s going on. (Normal people)
The polls are as ubiquitous as they are inaccurate, almost every year. In the US we have Nate Silver, of course, but despite his track record in the 2012 elections, humans do make mistakes.
Machines, on the other hand.
Yesterday St. Louis-based simMachines announced that it used artificial intelligence to accurately predict Costa Rica’s election earlier this month. Traditional polling methods had Luis Guillermo Solís in 4th place, but he actually finished 1st in the preliminary election.
simMachines was able to predict this accurately by using a similarity engine to study the syntax of Tweets about each party. Just like traditional polls, the company grouped them into three classifications: favorable, unfavorable, and neutral. The study analyzed 12,455 tweets and was able to accurately predict that Solís would finish in first place.
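For readers curious what similarity-based classification can look like, here is a minimal sketch. It is not simMachines' engine, whose internals the announcement does not describe. It assumes a plain bag-of-words cosine similarity (word overlap, not syntax) and a nearest-neighbor vote. The function names and the tiny labeled set are invented purely for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical hand-labeled examples, invented for this sketch (not real tweets).
LABELED = [
    ("great candidate with a strong plan", "favorable"),
    ("i love this party and its ideas", "favorable"),
    ("terrible candidate with a weak plan", "unfavorable"),
    ("i hate this party and its lies", "unfavorable"),
    ("the debate starts at eight tonight", "neutral"),
    ("polling stations open on sunday", "neutral"),
]

def vectorize(text):
    """Turn text into a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def classify(text, labeled=LABELED, k=3):
    """Label text by majority vote among its k most similar labeled examples.

    Falls back to "neutral" when nothing in the labeled set overlaps at all.
    """
    vec = vectorize(text)
    scored = sorted(
        ((cosine(vec, vectorize(example)), label) for example, label in labeled),
        reverse=True,
    )
    votes = [label for score, label in scored[:k] if score > 0]
    if not votes:
        return "neutral"
    return Counter(votes).most_common(1)[0][0]

def tally(texts, labeled=LABELED, k=3):
    """Count how many texts fall into each classification."""
    return Counter(classify(text, labeled, k) for text in texts)
```

Calling `tally` on the tweets that mention one party gives that party's favorable, unfavorable, and neutral counts, which can then be compared across parties. A production system would need far more labeled data and a richer similarity measure than word overlap.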
simMachines was founded by Costa Rican native Arnoldo Muller-Molina, PhD, and relocated to the United States as part of the St. Louis Arch Grants program. Muller-Molina wanted to conduct the study as a celebration of democracy in his home country.
“The technology was already in place. While some people sing or dance to celebrate, we are data scientists so we decided to analyze the social space,” he said.
That technology isn’t only good for predicting elections, though. Any data can be put through the system: Tweets and other social media, blogs, media publications, and some servers. Brands can use technology like this to more accurately track the tone consumers use when talking about them or their products.
The company compares similarity search to a “Swiss army knife,” which can be used in a wide range of data science projects.
With interest in data growing, similarity search and other data technologies have all the room in the world to grow.
As Nick and I like to tell each other, “If it’s based on data, let’s go with that. If it’s based on opinion, let’s go with mine.”