Software engineer at SwiftKey. Expert in natural language processing, machine learning, and data compression. Proficient in C++, Python, Java, and Clojure. Speaks Japanese (native), Chinese (beginner), and English (professional).
Software Engineer, Languages Team
Working on natural language processing for text input methods.
Software Engineer, R&D Department
Built a prototype of a Japanese input method in C++, including kana-kanji conversion, next-word prediction, and spelling correction. Developed a high-speed, memory-efficient dictionary compression algorithm that reduced memory consumption by 40%. Achieved 93% conversion accuracy (character-level F-score), outperforming Google Japanese Input.
Designed a phrase extraction technique for a predictive input method, including spam filtering, morphological analysis, and pronunciation inference algorithms; achieved 0.90 precision and 0.81 recall, and reduced the size of the predictive dictionary by 80%.
Developed a data processing system for a 1 TB Japanese web corpus crawled from the Internet and an N-gram counting program using Hadoop MapReduce, and published the results. Created an algorithm twice as fast as a naive approach and achieved a cross entropy of 5.65 bits in the best case.
Developer of Social IME: Cloud-Based Japanese Input Method (in Japanese)
Developed a Japanese input method based on crowdsourcing that lets people share dictionaries effectively through servers.
Reduced input time by 21% and keystrokes by 26% with a predictive input method. Reached 25 million API accesses per month and over 290,000 unique users.
Computer Game Development, 2002-2006
Developed various computer games for Windows. Developed every part of the last game, which consists of over 20,000 lines of C++ along with its graphics and music. Sold it to more than 1,000 customers and earned 1 million yen.
Publications (First Author Only)
- An Ensemble Model of Word-based and Character-based Models for Japanese and Chinese Input Method, Workshop on Advances in Text Input Methods, the 24th International Conference on Computational Linguistics, 2012.
- Applying mpaligner to Statistical Machine Transliteration with Japanese-Specific Heuristics, The 4th Named Entities Workshop, in the 50th Annual Meeting of the Association for Computational Linguistics, 2012.
- Phrase Extraction for Japanese Predictive Input Method as Post-Processing, Workshop on Advances in Text Input Methods, in the 5th International Joint Conference on Natural Language Processing, 2011.
- Spell Generation based on Edit Distance, Spelling Alteration for Web Search Workshop, 2011.
- Language Model Building and Evaluation using A Large-Scale Japanese Blog Corpus (in Japanese), The 17th Annual Meeting of The Association for Natural Language Processing, 2011.
- Japanese Input Method based on the Internet (in Japanese), Information Processing Society of Japan, Special Interest Group on Natural Language, No.190, 2009.
- Translator, Machine Learning for Hackers (in Japanese), O'Reilly, 2012.
- Supervisor, Technologies behind Japanese Input Methods (in Japanese), Gihyo, 2012.
- Translator, Mining the Social Web (in Japanese), O'Reilly, 2011.
- Supervisor, Natural Language Processing with Python (in Japanese), O'Reilly, 2010.
- Supervisor, Introduction to Machine Learning for Language Processing (in Japanese), Corona, 2010.
- Organizer, Workshop on Advances in Text Input Methods, 2012.
- TOEIC score of 900, 2012.
- TOEFL iBT score of 86, 2012.
- Statement of Accomplishment, Natural Language Processing Class (Stanford University), 2012.
- 5th place, Microsoft Speller Challenge, 2011.
- Member of Program Committee, Workshop on Advances in Text Input Methods, 2011.
- TopCoder rating: 1480, 2011.
- Founder of TokyoNLP, Tokyo, Japan, 2010-Present
Keio University, Faculty of Science and Technology, Department of Information and Computer Science, Hagiwara Laboratory, Tokyo, Japan
Master of Computer Science (Emphasis in Natural Language Processing), 2009
Thesis: Japanese Input Method based on the Internet
Bachelor of Computer Science (Emphasis in Machine Learning), 2007
Thesis: Neural Network for Collaborative Filtering