Open Data

Now I know what you’re thinking: “Whoa I didn’t think Mike was actually serious when he said he was going to post every two weeks.” At least that’s what I think you might be thinking…for all I know, no one reads this blog. Ah well, such is life. I’m gonna write anyways, if you don’t mind!

The next thing you might be thinking is “What is ‘Open Data’?” Open Data is information that is available to anyone who wants it. Wikipedia is a perfect example: it’s free, it’s easily available, and it’s (mostly) true. Right now, the largest and most obvious use of open data is to look stuff up. That’s a relatively straightforward use for it, but it makes you wonder: what else could you do with open data?


Back when I was in middle school, the computer lab admins put severe controls on the public-use computers around school. None of the games usually installed on Windows (like Solitaire or Minesweeper) were available to use, and you couldn’t even visit Facebook. Wikipedia, however, was not blocked. To entertain ourselves, my friends and I used to play “The Wikipedia Game”. If you’ve ever played “6 Degrees of Kevin Bacon”, you’d easily pick up this game, as it works the same way. You pick a random starting page on Wikipedia (let’s say Squash, the sport) and then pick a random ending page (Emperor Penguins), and the first person to get from one to the other by only clicking on page links wins the game. Unbeknownst to us children, this could be turned into an actual game.
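For the curious, here is a rough sketch of how a computer could play the Wikipedia Game using a breadth-first search, which finds the path with the fewest clicks. The link graph below is a tiny made-up example (the page names and links are assumptions for illustration, not real Wikipedia data), and find_path is just a name I picked.

```python
from collections import deque

# Hypothetical toy link graph: each page maps to the pages it links to.
# These links are invented for illustration, not taken from Wikipedia.
LINKS = {
    "Squash (sport)": ["Racket", "England"],
    "Racket": ["Tennis"],
    "England": ["Antarctic exploration", "Tennis"],
    "Tennis": ["England"],
    "Antarctic exploration": ["Antarctica"],
    "Antarctica": ["Emperor penguin"],
    "Emperor penguin": [],
}


def find_path(links, start, goal):
    """Return the shortest chain of pages from start to goal, or None."""
    queue = deque([[start]])  # each queue entry is a full path so far
    visited = {start}
    while queue:
        path = queue.popleft()
        page = path[-1]
        if page == goal:
            return path
        for next_page in links.get(page, []):
            if next_page not in visited:
                visited.add(next_page)
                queue.append(path + [next_page])
    return None  # no chain of links connects the two pages


path = find_path(LINKS, "Squash (sport)", "Emperor penguin")
print(" -> ".join(path))
```

Links only go one way, so a path from Squash to Emperor Penguins does not guarantee a path back.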

One of my colleagues in the Game Innovation Lab had that thought and came up with DataVentures, a video game based on open data. It’s a murder mystery game, built in a very similar way to the Wikipedia game. You can decide on a victim (it can be anyone in Wikipedia), and the system then builds a network of clues and suspects using links to articles about people, places, and things related to the victim. You then play by flying around the globe, visiting different cities and speaking with people or inspecting items that might give you clues.

As of now, the game does just that, but the immersion is still missing. As a player, I struggle to really feel like these people I’m talking to are anything more than constructs developed by the game engine to give me clues. So that’s where my work begins. I’ve managed to add more dialogue to the characters, to give them a sense of “person-ness”. As of now, only simple questions exist, like “Where were you born?” and “What are you known for?” Later I plan to give each character a personality, which is dynamically calculated based on their Wikipedia page.


The basic idea is to use Wikiquote to analyze character bios as well as statements made by these characters, creating a very basic personality for each individual, and then to apply that personality like a web over that character’s dialogue tree. It would allow a level of differentiation not yet available in the game. For example, Albert Einstein’s quotes might register in the system as “scientific” whereas Dr. Frankenstein’s might be “angry”. When you ask Einstein what he had for lunch, he would use a very precise description: “I had two slices of bread, with a mayonnaise glaze, two slices of honey baked ham, and one sliver of American cheese aged four months.” If you ask Dr. Frankenstein the same question, he might answer differently: “I had a sandwich, you dimwit. Why are you asking me this?”
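To make that a bit more concrete, here is a minimal sketch of the labelling step, assuming the simplest possible approach: count how many words in a character’s quotes match a keyword list for each personality. This is not how the game works today. The keyword lists, the label names, and the label_personality function are all hypothetical, and the sample lines are invented, not real quotations.

```python
import re
from collections import Counter

# Hypothetical keyword lists; a real system would need a much richer lexicon.
LEXICON = {
    "scientific": {"theory", "experiment", "energy", "measure", "universe"},
    "angry": {"fool", "dimwit", "hate", "curse", "wretch"},
}


def label_personality(quotes, lexicon=LEXICON, default="neutral"):
    """Pick the label whose keywords appear most often in the quotes."""
    counts = Counter()
    for quote in quotes:
        words = re.findall(r"[a-z']+", quote.lower())
        for label, keywords in lexicon.items():
            counts[label] += sum(word in keywords for word in words)
    # Fall back to the default when there are no quotes or no keyword hits.
    if not counts or max(counts.values()) == 0:
        return default
    return counts.most_common(1)[0][0]


# Invented sample lines for illustration only, not real quotations.
calm_lines = ["The experiment confirmed the theory.", "We can measure the energy."]
cross_lines = ["You fool, I curse this day."]

print(label_personality(calm_lines))
print(label_personality(cross_lines))
```

Counting keywords is crude, but it gives something to test the rest of the pipeline against before swapping in a proper analysis.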

You might also be thinking, “Why do you want to do this?” The straightforward answer is to see if it can be done. The challenge is twofold. One, applying sentiment analysis won’t be easy. The way sentiment analysis works won’t give us a completely accurate representation of the subject. It will give an accurate representation of what people think of the subject, and this is a little different from what we want, but it will have to do. What will be tricky is using sentiment analysis over such a small amount of input data. Some characters won’t have quotes, or the amount of data available on them won’t be very large. We will have to see how this factors into the accuracy of the sentiment analysis. Maybe characters will only be given personalities if they have “xyz” data points on them, or some metric like that. Two, I have to build this emotion into dialogue. I figure the easiest way to do that is to have pre-created words and phrases associated with certain emotions, and apply them where applicable in the dialogue for the correct personality. If you have any ideas, please comment below!
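Here is one way the two pieces above might fit together, as a sketch under my own assumptions: a minimum number of data points before a character gets a personality, and pre-written phrases attached to each personality. The threshold value, the phrase table, and the style_line function are all placeholders I made up; none of this is settled.

```python
MIN_QUOTES = 5  # placeholder for the "xyz" threshold; the real value is undecided

# Hypothetical pre-written phrases for each personality.
PHRASES = {
    "scientific": {"prefix": "Let me be precise. ", "suffix": ""},
    "angry": {"prefix": "", "suffix": " Why are you asking me this?"},
}


def style_line(base_line, personality, quote_count, min_quotes=MIN_QUOTES):
    """Decorate a dialogue line, or leave it plain when data is too sparse."""
    if quote_count < min_quotes or personality not in PHRASES:
        return base_line
    phrase = PHRASES[personality]
    return phrase["prefix"] + base_line + phrase["suffix"]


print(style_line("I had a sandwich.", "angry", quote_count=12))
print(style_line("I had a sandwich.", "scientific", quote_count=12))
print(style_line("I had a sandwich.", "angry", quote_count=2))  # too little data
```

Characters below the threshold simply keep the plain dialogue, so sparse data never produces a personality the system can’t back up.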

The more complicated answer to “Why do you want to do this?” is to see if we can make the game more realistic than it is now. We want to push the bounds to see what we can create using only open data as an input. Can you gain more than knowledge about a person from Wikipedia? Can you gain a sense of what talking to them might have been like? Those are the kinds of questions we would like to answer, and this is how we want to try to do so.

Eventually I’d like to build more on this game. My research interests extend into education, and I think this could be a ripe opportunity to include a learning element in the game. Characters could discuss their accomplishments with the player (if the player is curious). For example, Einstein could explain some of his theories if the player asked about them. No, it would not influence the overall gameplay, but players could learn something new while they played if they wanted to. My motivation for this arises from the original Wikipedia Game. Sometimes while playing, I’d discover a page I’d never seen before that piqued my interest. As I rapidly read through it looking for hyperlinks to click, I’d learn something new.

I think there’s something more to open data than it gets credit for. It can be more than just an encyclopedia of knowledge.  But it’s up to us to be curious enough to explore different uses.  My research team will be writing a paper on this topic sometime in the next few months, and I will post as soon as it is available. Thanks for reading!


