Okazaki Laboratory focuses on Natural Language Processing (NLP), the study of how computers can process human languages, and aims to realize Artificial Intelligence (AI). For example, we are exploring principles and methods for intelligent communication on computers, such as translating foreign-language texts, conversing with humans, answering questions, and describing scenes. We adopt cutting-edge approaches such as deep learning, building on the fundamentals of linguistics, statistics, and machine learning. We are also interested in developing real-world applications of our research, for example, social listening using big data analysis.

This page describes representative research topics of the laboratory (this is not a comprehensive list; we welcome new research topics).

Summarization / Natural Language Generation

There are several means of communication between humans and computers, and natural language is the most efficient among them. We are working on research topics such as automatic summarization and headline generation, as well as the automatic generation of fluent, high-quality text. (Takase et al., 2019) (Matsumaru et al., 2020)

Machine Translation

Machine translation, in which a computer automatically translates text from one language into another, is one of the most widely used applications of NLP. We are working on neural machine translation models that take grammatical structure into account, as well as context-sensitive machine translation for inputs that consist of multiple sentences. (Shimazu et al., 2020) (Bugliarello et al., 2020)

Sentiment Analysis / Opinion Mining

Analyzing people’s opinions from social media posts poses a major challenge: it requires common-sense knowledge. For example, we can expect that the speaker of the statement “We need to promote free trade” is probably in favor of the Trans-Pacific Partnership Agreement (TPP). Knowledge of the association between “TPP” and “free trade” is essential to making such an inference. We explore approaches for accurately recognizing and aggregating people’s opinions using common-sense knowledge acquired automatically from external documents. (Sasaki et al., 2017) (Sasaki et al., 2018) (Hanawa et al., 2019)
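
The TPP example above can be illustrated with a toy sketch in Python. It is not the laboratory's method: the `ASSOCIATIONS` table and the `infer_stance` function are hypothetical names introduced only for illustration, and the association scores are hand-written rather than acquired from documents.

```python
# Hypothetical association knowledge (hand-written for illustration):
# (concept, target) -> +1 if favoring the concept suggests favoring the target,
#                      -1 if it suggests opposing the target.
ASSOCIATIONS = {
    ("free trade", "TPP"): 1,
    ("tariffs", "TPP"): -1,
}


def infer_stance(concept, polarity, target):
    """Infer a stance toward `target` from a stance toward `concept`.

    `polarity` is +1 (speaker favors the concept) or -1 (speaker opposes it).
    Returns +1 (in favor), -1 (against), or 0 when no association is known.
    """
    return polarity * ASSOCIATIONS.get((concept, target), 0)


# "We need to promote free trade" expresses a positive stance toward "free trade".
print(infer_stance("free trade", 1, "TPP"))
```

Without the entry linking "free trade" to "TPP", the function returns 0, which mirrors the point of the paragraph: the inference is impossible without the background association.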

NLP Applications

There are various applications of NLP techniques, for example, analyzing public opinion from big data (SNS posts), and automatically generating and proofreading sentences. We also consider how our research contributes to the real world.

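The Asahi Shimbun, March 13, 2013, morning edition, page 2, "Earthquake disaster tweets up 20% from last year"; The Asahi Shimbun, July 3, 2013, morning edition, page 6, "6.11 million: another public opinion"; The Asahi Shimbun, July 26, 2013, morning edition, page 9, "The power to connect: showing its true value next time"; and others. Reproduction without the permission of The Asahi Shimbun Company is prohibited (permission number 17-6975).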

Representation Learning

How should we represent the semantics of words and sentences on computers? This has long been an outstanding problem in NLP. Recently, deep learning has been applied to NLP to learn vector representations of words from a large corpus and to compose vector representations of sentences from those of their constituent words.
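
As a minimal sketch of the composition idea, the Python snippet below builds a sentence vector by averaging word vectors and compares sentences by cosine similarity. Averaging is only one simple composition function, chosen here for brevity; the three-dimensional vectors in `WORD_VECTORS` are made-up values, not vectors learned from a corpus, and all names are illustrative assumptions.

```python
import math

# Toy 3-dimensional word vectors; the values are invented for illustration.
WORD_VECTORS = {
    "cats": [0.9, 0.1, 0.0],
    "dogs": [0.8, 0.2, 0.0],
    "chase": [0.1, 0.9, 0.1],
    "mice": [0.7, 0.0, 0.3],
    "stocks": [0.0, 0.1, 0.9],
    "fell": [0.1, 0.3, 0.8],
}


def sentence_vector(tokens):
    """Compose a sentence vector as the average of its known word vectors."""
    known = [WORD_VECTORS[t] for t in tokens if t in WORD_VECTORS]
    if not known:
        raise ValueError("no token has a known vector")
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


s1 = sentence_vector("cats chase mice".split())
s2 = sentence_vector("dogs chase mice".split())
s3 = sentence_vector("stocks fell".split())

# The two animal sentences end up closer to each other than to the finance one.
print(cosine(s1, s2) > cosine(s1, s3))
```

Averaging ignores word order, which is one reason learned composition functions are an active research topic.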

Natural Language Understanding

For a computer to interact with a human and understand human instructions, it must classify textual expressions into categories (e.g., person names and places) and map entity mentions to records in existing databases (e.g., Wikipedia). In addition, a scene described in natural language or depicted in an image can be understood by associating natural language expressions with objects and movements in images.
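
The mapping from mentions to database records can be sketched as a dictionary lookup. This is a deliberately simplified illustration, assuming a hand-made knowledge base: `KNOWLEDGE_BASE`, its record IDs, and `link_mention` are hypothetical, and the sketch ignores the ambiguity and context that make the real task hard.

```python
# Hypothetical miniature knowledge base:
# record ID -> (category, set of lowercase surface forms).
KNOWLEDGE_BASE = {
    "rec_tokyo": ("place", {"tokyo", "tokyo metropolis"}),
    "rec_turing": ("person", {"alan turing", "turing"}),
}


def link_mention(mention):
    """Return (record_id, category) for a mention, or None if nothing matches."""
    key = mention.strip().lower()
    for record_id, (category, aliases) in KNOWLEDGE_BASE.items():
        if key in aliases:
            return record_id, category
    return None


print(link_mention("Alan Turing"))
print(link_mention("Kyoto"))
```

A mention with no matching record, such as "Kyoto" here, returns `None`, so a caller can distinguish unknown entities from linked ones.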

Information Extraction / Knowledge Acquisition

Humans communicate with each other by interpreting the meaning of a text using common knowledge, such as the fact that smoking causes lung cancer. To equip computers with such common-sense knowledge, we are exploring methods to automatically acquire knowledge from large-scale data (e.g., the Web, Wikipedia, SNS posts).