Natural Language API Syntax Analysis: Overview and Usage

Introduction

This article explains what syntax analysis in the Natural Language API does and how to use it.

Overview

Syntax analysis in the Natural Language API breaks the given text into a series of sentences and tokens (usually words) and provides linguistic information about those tokens.

For details on the linguistic analysis, see Morphology & Dependency Trees.

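Example API response

When a request succeeds, the server returns a 200 OK HTTP status code and a JSON response like the one below. In a syntax analysis response, the partOfSpeech field carries part-of-speech information and the dependencyEdge field carries the syntactic relationship between words.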

{
  "text": {
    "content": "この",
    "beginOffset": -1
  },
  "partOfSpeech": {
    "tag": "DET",
    "proper": "NOT_PROPER"
  },
  "dependencyEdge": {
    "headTokenIndex": 1,
    "label": "DET"
  }
}
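
As a rough illustration of reading such a response, the sketch below parses a single token object with Python's standard json module and pulls out the part of speech and the dependency edge. The sample string is hand-written to mirror the example above and assumes the camelCase field names of the REST API (beginOffset, headTokenIndex); the helper name summarize_token is hypothetical.

```python
import json

# Hand-written sample shaped like one token of a syntax analysis response.
# Field names assume the camelCase form used by the REST API.
SAMPLE_TOKEN_JSON = """
{
  "text": {"content": "この", "beginOffset": -1},
  "partOfSpeech": {"tag": "DET", "proper": "NOT_PROPER"},
  "dependencyEdge": {"headTokenIndex": 1, "label": "DET"}
}
"""


def summarize_token(token):
    """Return (surface text, POS tag, head token index, dependency label)."""
    return (
        token["text"]["content"],
        token["partOfSpeech"]["tag"],
        token["dependencyEdge"]["headTokenIndex"],
        token["dependencyEdge"]["label"],
    )


token = json.loads(SAMPLE_TOKEN_JSON)
print(summarize_token(token))
```

Here the token "この" is tagged as a determiner (DET) and its head is the token at index 1.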

Interpreting the API response

Part of speech and morphology

Part-of-speech and morphological information is returned in the partOfSpeech field of the response.

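Morphology is the study of the internal structure of words. It looks at how the components of a word (stems, roots, prefixes, suffixes, and so on) are arranged or modified to produce different meanings. In English, for example, countable nouns are typically made plural by appending "-s" or "-es", and regular verbs are put into the past tense by appending "-d" or "-ed". The suffix "-ly" is added to adjectives to form adverbs (for example, adding "-ly" to the adjective "happy" gives the adverb "happily").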
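
The English suffix rules mentioned above can be sketched as a few toy functions. This only illustrates what a morphological rule looks like; it is not how the Natural Language API works internally. The function names and the simplified spelling rules are assumptions, and irregular words are not handled.

```python
def pluralize(noun):
    """Append -es after a sibilant ending, otherwise -s (regular nouns only)."""
    if noun.endswith(("s", "x", "z", "ch", "sh")):
        return noun + "es"
    return noun + "s"


def past_tense(verb):
    """Append -d after a final e, otherwise -ed (regular verbs only)."""
    if verb.endswith("e"):
        return verb + "d"
    return verb + "ed"


def to_adverb(adjective):
    """Append -ly, turning a final y into i first (happy -> happily)."""
    if adjective.endswith("y"):
        return adjective[:-1] + "ily"
    return adjective + "ly"


print(pluralize("book"), pluralize("box"))
print(past_tense("walk"), past_tense("like"))
print(to_adverb("quick"), to_adverb("happy"))
```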

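The Natural Language API uses morphological analysis to infer grammatical information about words.

Syntactic information

Syntactic information is returned in the dependencyEdge field of the response.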

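Syntax is the study of the structure of phrases and sentences. Syntax and morphology work together to express grammatical relationships, but how they do so differs from language to language. For example, to mark the direct object, Russian uses a suffix (the "у" in "книгу"), whereas English relies on word order and places the direct object after the verb ("read the book").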

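Usage

The following example runs syntax analysis on a text string sent directly to the Natural Language API.

Defining a function that performs syntax analysis

We write a program that analyzes the syntax of the input text and prints each token's text, location, lemma, and head token index, followed by the language of the text.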


from google.cloud import language_v1


def sample_analyze_syntax(text_content):
    """
    Analyzing Syntax in a String

    Args:
      text_content The text content to analyze
    """

    client = language_v1.LanguageServiceClient()

    # google-cloud-language 2.x: the document type field is named type_
    document = language_v1.Document(
        content=text_content,
        type_=language_v1.Document.Type.PLAIN_TEXT,
    )

    # No encoding type is requested, so begin_offset is reported as -1
    response = client.analyze_syntax(document=document)

    # Print the text, location, lemma, and head index of every token
    for token in response.tokens:
        print("Token text: {}".format(token.text.content))
        print(
            "Location of this token in overall document: {}".format(
                token.text.begin_offset
            )
        )
        print("Lemma: {}".format(token.lemma))
        print("Head token index: {}".format(token.dependency_edge.head_token_index))

    # The language given in the request or, if none was given,
    # the automatically-detected language.
    print("Language of the text: {}".format(response.language))
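
The API reports a token's location as -1 unless the request specifies an encoding type, because the offset is counted in units of that encoding. The sketch below computes, locally and without calling the API, where a token would start under UTF-8 (bytes), UTF-16 (code units), and UTF-32 (code points). The function name begin_offsets and the sample values are assumptions made for illustration.

```python
def begin_offsets(text, token_start):
    """Offsets of the character at index token_start under three encodings.

    token_start is a Python string index, i.e. a count of code points.
    """
    prefix = text[:token_start]
    return {
        "UTF8": len(prefix.encode("utf-8")),            # bytes
        "UTF16": len(prefix.encode("utf-16-le")) // 2,  # 16-bit code units
        "UTF32": len(prefix),                           # code points
    }


text = "この人は、この世の中で、いちばんしあわせな人にちがいありません。"
start = text.index("世の中")
print(begin_offsets(text, start))
```

Since Python strings are indexed by code point, requesting UTF-32 offsets (for example by passing encoding_type=language_v1.EncodingType.UTF32 to analyze_syntax) should make begin_offset line up with Python string indices.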

Running the syntax analysis

text_content = 'この人は、この世の中で、いちばんしあわせな人にちがいありません。'

sample_analyze_syntax(text_content)

Output

Token text: この
Location of this token in overall document: -1
Lemma: この
Head token index: 1
Token text: 人
Location of this token in overall document: -1
Lemma: 人
Head token index: 13
Token text: は
Location of this token in overall document: -1
Lemma: は
Head token index: 1
Token text: 、
Location of this token in overall document: -1
Lemma: 、
Head token index: 13
Token text: この
Location of this token in overall document: -1
Lemma: この
Head token index: 5
Token text: 世の中
Location of this token in overall document: -1
Lemma: 世の中
Head token index: 13
Token text: で
Location of this token in overall document: -1
Lemma: で
Head token index: 5
Token text: 、
Location of this token in overall document: -1
Lemma: 、
Head token index: 13
Token text: いちばん
Location of this token in overall document: -1
Lemma: いちばん
Head token index: 9
Token text: しあわせ
Location of this token in overall document: -1
Lemma: しあわせ
Head token index: 11
Token text: な
Location of this token in overall document: -1
Lemma: だ
Head token index: 9
Token text: 人
Location of this token in overall document: -1
Lemma: 人
Head token index: 13
Token text: に
Location of this token in overall document: -1
Lemma: に
Head token index: 11
Token text: ちがいあり
Location of this token in overall document: -1
Lemma: ちがいあり
Head token index: 13
Token text: ませ
Location of this token in overall document: -1
Lemma: ませ
Head token index: 13
Token text: ん
Location of this token in overall document: -1
Lemma: ん
Head token index: 13
Token text: 。
Location of this token in overall document: -1
Lemma: 。
Head token index: 13
Language of the text: ja
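
The head token indices in the output above describe a dependency tree: each token points at the index of its head, and the root token points at itself (token 13, "ちがいあり", in this output). As a sketch of how to read them, the following code hard-codes the tokens and head indices from the output and renders the tree with indentation. The names TOKENS, HEADS, and render_tree are hypothetical.

```python
from collections import defaultdict

# Tokens and head token indices copied from the output above.
TOKENS = ["この", "人", "は", "、", "この", "世の中", "で", "、", "いちばん",
          "しあわせ", "な", "人", "に", "ちがいあり", "ませ", "ん", "。"]
HEADS = [1, 13, 1, 13, 5, 13, 5, 13, 9, 11, 9, 13, 11, 13, 13, 13, 13]


def render_tree(tokens, heads):
    """Return the dependency tree as indented lines, root first."""
    children = defaultdict(list)
    root = None
    for index, head in enumerate(heads):
        if head == index:
            root = index  # the root token is its own head
        else:
            children[head].append(index)

    lines = []

    def walk(index, depth):
        # Emit this token, then its dependents one level deeper.
        lines.append("  " * depth + tokens[index])
        for child in children[index]:
            walk(child, depth + 1)

    walk(root, 0)
    return lines


print("\n".join(render_tree(TOKENS, HEADS)))
```

The root is printed first, with its direct dependents one level below it and their own dependents below that.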

Conclusion

This article gave an overview of syntax analysis in the Natural Language API and showed how to use it from Python.