Yoh Okuno's Resume
Accomplished, creative, experienced Software Engineer with expertise in natural language processing, machine learning, data mining, C/C++, Python, and Hadoop. Proven track record of applying theoretical knowledge, demonstrated by academic papers and by the development of open source and commercial software. Language skills include Japanese (native) and English (advanced).
- Middleware: Hadoop, Apache, MySQL, Thrift, NLTK, mecab, marisa-trie, DirectX
- Platforms: Linux, Android, Windows, Mac OS X, FreeBSD
Software Engineer, Language Team
Working on natural language processing for text input methods.
Yahoo Japan Corporation, 2009-2012
Software Engineer, R&D Department
Built a prototype of a statistical kana-kanji conversion engine in C++, including predictive input and spelling correction components. Developed a fast, memory-efficient dictionary compression and search engine, reducing memory consumption by 40%. Achieved 93% conversion accuracy (character-level F-score), outperforming Google Japanese IME (Mozc).
Designed a phrase extraction technique for a predictive input method, including spam filtering, morphological analysis, and pronunciation inference algorithms; achieved 0.90 precision and 0.81 recall, and reduced the size of the predictive dictionary by 80%.
Developed a data processing system for a 1 TB Japanese blog corpus crawled from the Internet and an N-gram (N = 1 to 7) counting program using Hadoop MapReduce, and published the results. Created an algorithm 2x faster than a naive approach and achieved a cross entropy of 5.65 bits in the best case.
Exploratory Software Project, 2007-2008
Developer of Social IME: Cloud-Based Japanese Input Method (in Japanese)
Developed, implemented, and published a framework that combines two technologies, cloud computing and input methods, enabling people to share dictionaries effectively on servers.
Reduced input time by 21% and keystrokes by 26% with a predictive input method. Achieved 18 million accesses per month with over 7 million unique users per month.
Computer Game Development, 2006
Developed a computer game (in Japanese), a curtain fire shooting game with beautiful graphics. Wrote over 20,000 lines of C/C++ to implement the game logic and a general framework. Achieved 60 FPS on slow laptops. Sold to over 1,000 customers for total sales of one million yen, an exceptional case for an individual product.
Publications (First Author)
- An Ensemble Model of Word-based and Character-based Models for Japanese and Chinese Input Method, Workshop on Advances in Text Input Methods, the 24th International Conference on Computational Linguistics, 2012.
- Applying mpaligner to Statistical Machine Transliteration with Japanese-Specific Heuristics, The 4th Named Entities Workshop, in the 50th Annual Meeting of the Association for Computational Linguistics, 2012.
- Phrase Extraction for Japanese Predictive Input Method as Post-Processing, Workshop on Advances in Text Input Methods, in the 5th International Joint Conference on Natural Language Processing, 2011.
- Spell Generation based on Edit Distance, Spelling Alteration for Web Search Workshop, 2011.
- Language Model Building and Evaluation using A Large-Scale Japanese Blog Corpus (in Japanese), The 17th Annual Meeting of The Association for Natural Language Processing, 2011.
- Japanese Input Method based on the Internet (in Japanese), Information Processing Society of Japan, Special Interest Group on Natural Language, No.190, 2009.
- Translator, Machine Learning for Hackers (in Japanese), O'Reilly, 2012.
- Supervisor, Technologies behind Japanese Input Methods (in Japanese), Gihyo, 2012.
- Translator, Mining the Social Web (in Japanese), O'Reilly, 2011.
- Supervisor, Natural Language Processing with Python (in Japanese), O'Reilly, 2010.
- Supervisor, Introduction to Machine Learning for Language Processing (in Japanese), Corona, 2010.
- Organizer, Workshop on Advances in Text Input Methods, 2012.
- TOEIC score of 900, 2012.
- TOEFL iBT score of 86, 2012.
- Statement of Accomplishment, Natural Language Processing Class (Stanford University), 2012.
- Winner of 5th place, Microsoft Speller Challenge, 2011.
- Member of Program Committee, Workshop on Advances in Text Input Methods, 2011.
- TopCoder rating: 1480, 2011.
- Founder of TokyoNLP, Tokyo, Japan, 2010-Present
Keio University, Faculty of Science and Technology, Department of Information and Computer Science, Hagiwara Laboratory, Tokyo, Japan
Master of Computer Science (Emphasis in Natural Language Processing), 2009
Thesis: Japanese Input Method based on the Internet
Bachelor of Computer Science (Emphasis in Machine Learning), 2007
Thesis: Neural Network for Collaborative Filtering