Kotodama: The Power of Words
Frequently Asked Questions

How did Kotodama begin?

In fall 2003, while we were graduate students at the Entertainment Technology Center (ETC) at Carnegie Mellon University, Joshua Taylor and I wrote a research paper titled I.W.I.L.A.: Intelligent Worlds of Interactivity for Language Acquisition. In it we sketched out a methodology for language learning in which students could play with language as they learned it, making it fun, relevant and reactive in a videogame context. We grafted an intelligent tutoring system onto a videogame structure whose main throughput is language. Many similar research projects had gone about this in reverse, starting with a traditional language curriculum and simply giving it a videogame-like veneer. We wanted to create, first and foremost, a fun videogame in which a player’s level of language fluency equals her power and agency within the game.

Excited by the reception of our research, I organized several students with a diverse range of skills to start building a playable prototype. After getting approval from the ETC faculty board, Sam Hart, Yuki Izena, Sabrina Haskell and I began working in fall 2004. We asked two language experts from CMU, Professor Christopher Jones and Professor Sono Takano-Hayes, to advise our group. We also recruited Professor Tina Blaine from the ETC, whose research involves building novel musical and game interfaces such as the Jam-O-Drum. In a semester we created a prototype of a Japanese language-learning experience, which we called Pettochan: Friend Pet. We used Panda3D, the open-source game engine developed by Disney, and scripted it in Python. To create the content we used Maya and Photoshop for the visuals and Pro Tools for the sound.

 

What is Pettochan: Friend Pet?

We geared Pettochan toward early-teenage boys and girls who have an interest in anime and/or Japanese culture. Players don’t need any previous knowledge of Japanese to play it, even though the game is full-immersion without any English aids. A mentor character speaks and mimes what the player should do to learn each new mini-game. Pettochan is inspired by Tamagotchi, except that the pet lives in the corner of your desktop screen instead of in your pocket. When the student wants to play with it, she simply clicks on her pet and the next mini-game begins. When the player loads Pettochan for the first time she’s given an egg to care for, which eventually hatches and evolves into more advanced forms as she masters the mini-games. The games are played by clicking words in a molecular menu system that springs out of the pet and floats on screen. The menu options move so the player can’t simply memorize their location. In Pettochan we concentrated on teaching aural, visual (textual), kinetic and contextual recognition. (Unlike our other project, Kotodama, Pettochan doesn’t use speech recognition.)
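The moving-menu idea can be sketched in a few lines of Python, the language the prototype was scripted in. This is only an illustrative sketch and not the Pettochan code: the function name place_menu_options, the ring layout and the sample words are all assumptions, and no Panda3D calls are involved.

```python
import math
import random


def place_menu_options(words, radius=1.0, rng=None):
    """Return {word: (x, y)}, assigning each word a random slot on a ring.

    Hypothetical helper: the slots are evenly spaced around a circle, and
    the word-to-slot assignment is reshuffled on every call, so a player
    cannot rely on remembering where a word appeared last time.
    """
    if not words:
        return {}
    rng = rng or random.Random()
    slots = list(range(len(words)))
    rng.shuffle(slots)  # new word-to-slot assignment each time
    step = 2 * math.pi / len(words)
    return {
        word: (radius * math.cos(slot * step), radius * math.sin(slot * step))
        for word, slot in zip(words, slots)
    }


# Example: four color words (romanized here purely for illustration).
layout = place_menu_options(["aka", "ao", "kiiro", "midori"])
```

Calling the function again before each round gives every word a fresh position, which forces the player to read or listen for the word rather than click a remembered spot.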

The vocabulary for Pettochan complemented the menu constraints and was divided into sets, one for each mini-game. For example, there’s a DDR-like game that teaches movement and colors by having the player make the pet dance, and another game about running around the screen and commanding your pet to gorge on certain foods. User tests at a local high school proved extremely useful, mostly helping us tweak some pacing issues. At the end of the semester we were encouraged by our results but wanted to explore how we could leverage virtual 3D environments (as opposed to the 2D environments that we had used up till then).

 

What is Kotodama: The Power of Words?

In spring 2005 three more students joined our group: Yi-Hong Lin, Ji-Young Lee and Charles Brandt. Rather than continue where we had left off with Pettochan, we decided to start a new project, Kotodama: The Power of Words, and began production of a new 3D game from scratch. We integrated speech recognition and scripted a darker storyline. Our target demographic was high school students interested in videogames and anime. We stuck with the Panda3D engine and successfully integrated it with Julius, an Open-Source Large Vocabulary CSR Engine developed at Kyoto University, for speech recognition.

We thought a role-playing game (RPG) would facilitate language learning because of the types of learning inherent in the genre. The aspects of RPGs that interested us in particular were exploring strange environments, interacting with interesting characters, facing escalating battle challenges and accruing items, spells and experience points. We tried to bring all of these together in Kotodama, making each relevant to language usage, listening comprehension and language synthesis as the core mechanics of play.

The player begins the game by being transported through a magic book onto an alien world called Inuboshi. To imagine Inuboshi, think of the Little Prince on a small Japanese-inspired planet (including its own version of Mt. Fuji). With a PS2-style controller the player can walk her avatar, a young boy, around the planet. An old, dog-like (bipedal) mentor character named Doguemon teaches the player the basics, such as how to point at anything and say, “What is that?” (in Japanese, of course), after which the object will speak its own name. Doguemon also teaches the player verbs, which she can combine with other words as she wishes. For example, if a large rock is in the way or is needed for battle, the player can point at the rock and say, “[The] rock rises!” and it will lift into the air to await further instructions. In Kotodama the primary source of agency is the player’s voice.
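The point-and-speak mechanic can be sketched in Python as a small dispatcher. This is a minimal sketch under stated assumptions, not the Kotodama implementation: it assumes the recognizer's output has already been turned into a list of romanized words, and the names WorldObject, VERBS, QUESTION and handle_utterance, as well as the sample vocabulary, are hypothetical. It does not call Julius or Panda3D.

```python
from dataclasses import dataclass


@dataclass
class WorldObject:
    """Hypothetical stand-in for an object the player can point at."""
    name: str            # the noun the object answers to (romanized here)
    height: float = 0.0  # how far the object has been lifted

    def rise(self):
        self.height += 1.0


# Hypothetical vocabulary: spoken verb -> name of the method to call.
VERBS = {"agaru": "rise"}
# Hypothetical one-word stand-in for the "What is that?" question.
QUESTION = "nani"


def handle_utterance(tokens, target):
    """Apply recognized words to the pointed-at object.

    Returns the object's name for the question, "ok" when a noun + verb
    command matches the target, and "ignored" otherwise.
    """
    if target is None or not tokens:
        return "ignored"
    if tokens == [QUESTION]:
        return target.name  # the object "speaks" its own name
    if len(tokens) == 2 and tokens[0] == target.name and tokens[1] in VERBS:
        getattr(target, VERBS[tokens[1]])()
        return "ok"
    return "ignored"


rock = WorldObject("iwa")
handle_utterance(["nani"], rock)          # the rock answers with its name
handle_utterance(["iwa", "agaru"], rock)  # "[The] rock rises!"
```

Keeping the verbs in a lookup table means each verb the player learns could be added as one more entry, and any learned noun could be paired with it.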

When we tested this with students, they were extremely excited to discover a (potential) game where all the hours of playing and memorizing its world would have direct correlation and value in the “real” world. As a caveat, Kotodama was never intended to replace classroom learning. On the contrary, it would encourage classroom learning because the students would literally be “powering up” as they learned new material in class. After school they could test these new powers by playing the game. Suddenly, their studies and their play would be relevant to one another, encouraging student engagement with both in a kind of symbiotic play-learning.

 

How well did the speech recognition work?

The speech recognition works most of the time, and the software tends to respond better when players speak loudly. A common glitch occurred when players paused in the middle of a word or phrase. After a few minutes of trial and error most players could figure out the right speed at which to speak. Currently the game gives players no feedback on how accurately they’ve spoken; a command is simply hit or miss.
 

What feedback did you get on Kotodama from your presentation at the Game Developers Conference in 2005?

It was well received, with many industry professionals asking whether it was under development for commercial release. It wasn’t at the time and still isn’t, but we’re always interested in speaking with potential investors, publishers or developers. The primary concern most industry people have is the robustness of the speech recognition.

We also presented at:

SEATJ (The Southeastern Association for Teachers of Japanese) in 2005, hosted by the Georgia Institute of Technology.

CALICO (Computer Assisted Language Instruction Consortium) in 2005 at Michigan State University.

JALTCALL (Japan Association for Language Teaching, CALL) in 2005 at Ritsumeikan University in Japan.
