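Natural Language API Entity Analysis: Overview and Usage

Introduction

This article gives an overview of entity analysis in the Natural Language API and explains how to use it.

Entity analysis inspects the given text for known entities (proper nouns such as public figures and landmarks) and returns information about those entities.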

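Overview of entity analysis

For the full list of entity types, see the Entity documentation.

For example, the following types are available.

Entity type	Meaning
UNKNOWN	Unknown
PERSON	Person
LOCATION	Location
ORGANIZATION	Organization
EVENT	Event
WORK_OF_ART	Work of art
CONSUMER_GOOD	Consumer product
OTHER	Other types of entities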

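Example API response

If the request succeeds, the server returns a 200 OK HTTP status code and a JSON response like the one below.

The entities array contains Entity objects representing the detected entities. Each object includes information such as the entity's name and type.
Entities are returned in descending order of their salience score, which indicates how relevant each entity is to the text as a whole.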

{
  "entities": [
    {
      "name": "Trump",
      "type": "PERSON",
      "metadata": {
        "mid": "/m/0cqt90",
        "wikipedia_url": "https://en.wikipedia.org/wiki/Donald_Trump"
      },
      "salience": 0.7936003,
      "mentions": [
        {
          "text": {
            "content": "Trump",
            "beginOffset": 10
          },
          "type": "PROPER"
        },
        {
          "text": {
            "content": "President",
            "beginOffset": 0
          },
          "type": "COMMON"
        }
      ]
    },
    {
      "name": "White House",
      "type": "LOCATION",
      "metadata": {
        "mid": "/m/081sq",
        "wikipedia_url": "https://en.wikipedia.org/wiki/White_House"
      },
      "salience": 0.09172433,
      "mentions": [
        {
          "text": {
            "content": "White House",
            "beginOffset": 36
          },
          "type": "PROPER"
        }
      ]
    },
    {
      "name": "Pennsylvania Ave NW",
      "type": "LOCATION",
      "metadata": {
        "mid": "/g/1tgb87cq"
      },
      "salience": 0.085507184,
      "mentions": [
        {
          "text": {
            "content": "Pennsylvania Ave NW",
            "beginOffset": 65
          },
          "type": "PROPER"
        }
      ]
    },
    {
      "name": "Washington, DC",
      "type": "LOCATION",
      "metadata": {
        "mid": "/m/0rh6k",
        "wikipedia_url": "https://en.wikipedia.org/wiki/Washington,_D.C."
      },
      "salience": 0.029168168,
      "mentions": [
        {
          "text": {
            "content": "Washington, DC",
            "beginOffset": 86
          },
          "type": "PROPER"
        }
      ]
    },
    {
      "name": "1600 Pennsylvania Ave NW, Washington, DC",
      "type": "ADDRESS",
      "metadata": {
        "country": "US",
        "sublocality": "Fort Lesley J. McNair",
        "locality": "Washington",
        "street_name": "Pennsylvania Avenue Northwest",
        "broad_region": "District of Columbia",
        "narrow_region": "District of Columbia",
        "street_number": "1600"
      },
      "salience": 0,
      "mentions": [
        {
          "text": {
            "content": "1600 Pennsylvania Ave NW, Washington, DC",
            "beginOffset": 60
          },
          "type": "TYPE_UNKNOWN"
        }
      ]
    },
    {
      "name": "1600",
      "type": "NUMBER",
      "metadata": {
        "value": "1600"
      },
      "salience": 0,
      "mentions": [
        {
          "text": {
            "content": "1600",
            "beginOffset": 60
          },
          "type": "TYPE_UNKNOWN"
        }
      ]
    },
    {
      "name": "October 7",
      "type": "DATE",
      "metadata": {
        "day": "7",
        "month": "10"
      },
      "salience": 0,
      "mentions": [
        {
          "text": {
            "content": "October 7",
            "beginOffset": 105
          },
          "type": "TYPE_UNKNOWN"
        }
      ]
    },
    {
      "name": "7",
      "type": "NUMBER",
      "metadata": {
        "value": "7"
      },
      "salience": 0,
      "mentions": [
        {
          "text": {
            "content": "7",
            "beginOffset": 113
          },
          "type": "TYPE_UNKNOWN"
        }
      ]
    }
  ],
  "language": "en"
}
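
As a rough illustration of how a response like the one above can be consumed, the following sketch parses a shortened copy of it with Python's standard json module and lists the entities by salience. The helper name summarize_entities is hypothetical, and the sample keeps only the first two entities from the response above.

```python
import json

# Shortened copy of the response above (first two entities only).
SAMPLE_RESPONSE = """
{
  "entities": [
    {
      "name": "Trump",
      "type": "PERSON",
      "metadata": {"mid": "/m/0cqt90"},
      "salience": 0.7936003,
      "mentions": [
        {"text": {"content": "Trump", "beginOffset": 10}, "type": "PROPER"},
        {"text": {"content": "President", "beginOffset": 0}, "type": "COMMON"}
      ]
    },
    {
      "name": "White House",
      "type": "LOCATION",
      "metadata": {"mid": "/m/081sq"},
      "salience": 0.09172433,
      "mentions": [
        {"text": {"content": "White House", "beginOffset": 36}, "type": "PROPER"}
      ]
    }
  ],
  "language": "en"
}
"""


def summarize_entities(response_json):
    """Return (name, type, salience, mention texts) tuples, highest salience first.

    Hypothetical helper for illustration; not part of any Google library.
    """
    response = json.loads(response_json)
    # Sort explicitly instead of relying on the order of the input.
    entities = sorted(response["entities"], key=lambda e: e["salience"], reverse=True)
    return [
        (e["name"], e["type"], e["salience"], [m["text"]["content"] for m in e["mentions"]])
        for e in entities
    ]


for name, type_, salience, mentions in summarize_entities(SAMPLE_RESPONSE):
    print("{} ({}): salience={:.3f}, mentions={}".format(name, type_, salience, mentions))
```

Note that a single entity can carry several mentions: here the PERSON entity is referred to both by a proper noun ("Trump") and by a common noun ("President").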

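How to use entity analysis

The following example performs entity analysis on a text string sent directly to the Natural Language API.

Defining a function that performs entity analysis

We write a program that extracts the entities from the input text and prints each of them.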


# NOTE: this sample targets google-cloud-language 1.x. The enums and types
# modules imported below were removed in version 2.0 of the library.
from google.cloud import language_v1
from google.cloud.language_v1 import enums
from google.cloud.language_v1 import types


def sample_analyze_entities(text_content):
    """
    Analyze the entities in a string and print them.

    Args:
      text_content: The text content to analyze.
    """

    client = language_v1.LanguageServiceClient()

    # Wrap the raw string in a plain-text Document. No language is given,
    # so the API detects it automatically.
    document = types.Document(
        content=text_content,
        type=enums.Document.Type.PLAIN_TEXT)

    response = client.analyze_entities(document=document)

    # Loop through entities returned from the API
    for entity in response.entities:
        print(u"Representative name for the entity: {}".format(entity.name))

        # Get entity type, e.g. PERSON, LOCATION, ADDRESS, NUMBER, et al
        # (printed as the numeric enum value)
        print(u"Entity type: {}".format(entity.type))

        # Get the salience score associated with the entity in the [0, 1.0] range
        print(u"Salience score: {}".format(entity.salience))

        # Loop over the metadata associated with entity. For many known entities,
        # the metadata is a Wikipedia URL (wikipedia_url) and Knowledge Graph MID (mid).
        # Some entity types may have additional metadata, e.g. ADDRESS entities
        # may have metadata for the address street_name, postal_code, et al.
        for metadata_name, metadata_value in entity.metadata.items():
            print(u"{}: {}".format(metadata_name, metadata_value))

        # Loop over the mentions of this entity in the input document.
        # A mention can be a proper noun (PROPER) or a common noun (COMMON).
        for mention in entity.mentions:
            print(u"Mention text: {}".format(mention.text.content))

            # Get the mention type (printed as the numeric enum value)
            print(u"Mention type: {}".format(mention.type))

    # Get the language of the text. No language was specified in the request,
    # so this is the automatically-detected language.
    print(u"Language of the text: {}".format(response.language))
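
For reference, the client call above corresponds to a REST request against the documents:analyzeEntities method. The following is a minimal sketch of what such a request body might look like, assuming the v1 REST endpoint and the UTF8 encoding type. The helper name build_analyze_entities_request and the sample sentence are hypothetical, and the sketch only builds and prints the body; authentication and the HTTP call itself are left out.

```python
import json

# REST endpoint for entity analysis in the v1 API.
ANALYZE_ENTITIES_URL = "https://language.googleapis.com/v1/documents:analyzeEntities"


def build_analyze_entities_request(text_content, language=None):
    """Build the JSON body for an analyzeEntities request (hypothetical helper)."""
    document = {"type": "PLAIN_TEXT", "content": text_content}
    if language is not None:
        # Optional: when omitted, the API detects the language automatically.
        document["language"] = language
    # encodingType determines how beginOffset values are calculated.
    return {"document": document, "encodingType": "UTF8"}


# Hypothetical sample sentence, used only to show the shape of the body.
body = build_analyze_entities_request("The White House is in Washington, DC.")
print("POST", ANALYZE_ENTITIES_URL)
print(json.dumps(body, indent=2, ensure_ascii=False))
```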

Running entity analysis

text_content = 'この人は、この世の中で、いちばんしあわせな人にちがいありません。芝居小屋もすばらしいし、お客さんもすばらしい人たちでした。もし中世の時代だったら、おそらく、火あぶりにされたでしょうよ。みんなのうるさいことといったら、まるで、ハエがびんの中で、ブンブンいっているようでした。われわれ人間が、こういうことを考えだすことができるとすれば、われわれは、地の中にうめられるまでに、もっと長生きできてもいいはずだが'

sample_analyze_entities(text_content)

Execution result

Representative name for the entity: 人
Entity type: 1
Salience score: 0.41499295830726624
Mention text: 人
Mention type: 2
Mention text: 人
Mention type: 2
Mention text: 人
Mention type: 2
Representative name for the entity: 世の中
Entity type: 2
Salience score: 0.14064659178256989
Mention text: 世の中
Mention type: 2
Representative name for the entity: 芝居小屋
Entity type: 2
Salience score: 0.06455273926258087
Mention text: 芝居小屋
Mention type: 2
Representative name for the entity: 客
Entity type: 1
Salience score: 0.06455273926258087
Mention text: 客
Mention type: 2
Representative name for the entity: こと
Entity type: 7
Salience score: 0.03601405769586563
Mention text: こと
Mention type: 2
Representative name for the entity: こと
Entity type: 7
Salience score: 0.03595975041389465
Mention text: こと
Mention type: 2
Representative name for the entity: 火あぶり
Entity type: 7
Salience score: 0.03429386019706726
Mention text: 火あぶり
Mention type: 2
Representative name for the entity: 中
Entity type: 7
Salience score: 0.03232789412140846
Mention text: 中
Mention type: 2
Representative name for the entity: 時代
Entity type: 7
Salience score: 0.031792446970939636
Mention text: 時代
Mention type: 2
Representative name for the entity: 中世
Entity type: 7
Salience score: 0.031792446970939636
Mention text: 中世
Mention type: 2
Representative name for the entity: 中
Entity type: 7
Salience score: 0.03158647194504738
Mention text: 中
Mention type: 2
Representative name for the entity: 地
Entity type: 2
Salience score: 0.023291485384106636
Mention text: 地
Mention type: 2
Representative name for the entity: 人間
Entity type: 1
Salience score: 0.022534204646945
Mention text: 人間
Mention type: 2
Representative name for the entity: びん
Entity type: 7
Salience score: 0.01783117465674877
Mention text: びん
Mention type: 2
Representative name for the entity: ハエ
Entity type: 7
Salience score: 0.01783117465674877
Mention text: ハエ
Mention type: 2
Language of the text: ja
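
The entity and mention types in this output are printed as numeric enum values. The sketch below shows one way to turn them back into names. It assumes the values follow the v1 enum definitions (UNKNOWN = 0 through OTHER = 7, in the order of the table earlier in this article, and TYPE_UNKNOWN = 0, PROPER = 1, COMMON = 2 for mentions); the EntityType and MentionType classes and the describe helper are local, hypothetical stand-ins, not the library's own classes.

```python
from enum import IntEnum


class EntityType(IntEnum):
    """Local mirror of the entity type values (assumed from the v1 API reference)."""
    UNKNOWN = 0
    PERSON = 1
    LOCATION = 2
    ORGANIZATION = 3
    EVENT = 4
    WORK_OF_ART = 5
    CONSUMER_GOOD = 6
    OTHER = 7


class MentionType(IntEnum):
    """Local mirror of the mention type values (assumed from the v1 API reference)."""
    TYPE_UNKNOWN = 0
    PROPER = 1
    COMMON = 2


def describe(entity_type, mention_type):
    """Convert numeric enum values into a readable 'ENTITY / MENTION' label."""
    return "{} / {}".format(EntityType(entity_type).name, MentionType(mention_type).name)


print(describe(1, 2))  # PERSON / COMMON
print(describe(7, 2))  # OTHER / COMMON
```

Under that assumption, the entities with type 1 in the output above are PERSON, those with type 2 are LOCATION, those with type 7 are OTHER, and every mention (type 2) is a common noun.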

Conclusion

This article gave an overview of entity analysis in the Natural Language API and showed how to use it from Python.