Python 100 project #14: Google Cloud Natural Language API

This is more of an introduction to the Google Cloud Natural Language API.

I’m planning to scrape drama transcripts from the web and visualize them. In past projects I’ve used wordcloud quite often, but it merely counts how often each word appears in the text. Frequency is of course a big factor in judging how important a word is, but it isn’t the only one. I’m going to use the Cloud Natural Language API and compare its results with the frequency-based ones, and hopefully I’ll find something new about my favourite dramas.

I usually use a third-party library when one exists, and the Cloud Natural Language API does have a Python client library (part of google-cloud-python). This time, though, I’m using plain requests to see what the raw request and response look like.

 

[ analyzeEntities ]

import requests

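# Replace with your own API key.
MyAPIKEY = "your-api-key"

# REST endpoint for entity analysis; the key is passed as a query parameter.
url = "https://language.googleapis.com/v1/documents:analyzeEntities?key={}"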

says = "They're made of plastic. Living plastic creatures. They're being controlled by a relay device in the roof, which would be a great big problem if I didn't have this. So, I'm going to go up there and blow them up, and I might well die in the process, but don't worry about me. No, you go home. Go on. Go and have your lovely beans on toast. Don't tell anyone about this, because if you do, you'll get them killed."

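# Request body: the text to analyze, plus the encoding used to calculate beginOffset.
params = {
    "document": {
        "type": "PLAIN_TEXT",
        "content": says,
    },
    "encodingType": "UTF8"
}

# Send the body as JSON in a POST request.
r = requests.post(url.format(MyAPIKEY), json=params)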

r.json()
{'entities': [{'name': 'relay device',
   'type': 'OTHER',
   'metadata': {},
   'salience': 0.5245988,
   'mentions': [{'text': {'content': 'relay device', 'beginOffset': 81},
     'type': 'COMMON'},
    {'text': {'content': 'problem', 'beginOffset': 134}, 'type': 'COMMON'}]},
  {'name': 'plastic',
   'type': 'OTHER',
   'metadata': {},
   'salience': 0.2275309,
   'mentions': [{'text': {'content': 'plastic', 'beginOffset': 16},
     'type': 'COMMON'}]},
  {'name': 'creatures',
   'type': 'OTHER',
   'metadata': {},
   'salience': 0.11286028,
   'mentions': [{'text': {'content': 'creatures', 'beginOffset': 40},
     'type': 'COMMON'}]},
  {'name': 'roof',
   'type': 'OTHER',
   'metadata': {},
   'salience': 0.04330202,
   'mentions': [{'text': {'content': 'roof', 'beginOffset': 101},
     'type': 'COMMON'}]},
  {'name': 'process',
   'type': 'OTHER',
   'metadata': {},
   'salience': 0.027145086,
   'mentions': [{'text': {'content': 'process', 'beginOffset': 240},
     'type': 'COMMON'}]},
  {'name': 'toast',
   'type': 'OTHER',
   'metadata': {},
   'salience': 0.020346878,
   'mentions': [{'text': {'content': 'toast', 'beginOffset': 332},
     'type': 'COMMON'}]},
  {'name': 'beans',
   'type': 'OTHER',
   'metadata': {},
   'salience': 0.0200992,
   'mentions': [{'text': {'content': 'beans', 'beginOffset': 323},
     'type': 'COMMON'}]},
  {'name': 'anyone',
   'type': 'PERSON',
   'metadata': {},
   'salience': 0.015136869,
   'mentions': [{'text': {'content': 'anyone', 'beginOffset': 350},
     'type': 'COMMON'}]},
  {'name': 'home',
   'type': 'LOCATION',
   'metadata': {},
   'salience': 0.008979974,
   'mentions': [{'text': {'content': 'home', 'beginOffset': 286},
     'type': 'COMMON'}]}],
 'language': 'en'}
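
The response is a plain dictionary, so ranking the entities by salience takes only a few lines. Below is a minimal sketch of that idea. The helper name top_entities is my own (it is not part of the API), and sample_response is a trimmed copy of the output above so the snippet runs without an API key.

```python
# Trimmed copy of the analyzeEntities response shown above (three entities only).
sample_response = {
    "entities": [
        {"name": "plastic", "type": "OTHER", "salience": 0.2275309},
        {"name": "relay device", "type": "OTHER", "salience": 0.5245988},
        {"name": "home", "type": "LOCATION", "salience": 0.008979974},
    ],
    "language": "en",
}


def top_entities(response, n=5):
    """Return up to n (name, salience) pairs, highest salience first.

    Hypothetical helper for illustration; not part of the API.
    """
    entities = response.get("entities", [])
    ranked = sorted(entities, key=lambda e: e.get("salience", 0.0), reverse=True)
    return [(e["name"], e.get("salience", 0.0)) for e in ranked[:n]]


for name, salience in top_entities(sample_response):
    print(f"{name:<15}{salience:.3f}")
```

With the real response, passing r.json() instead of sample_response should work the same way, assuming the request succeeded and the body has the shape shown above.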

 

[ analyzeSentiment ]

# Same request body as before, sent to the sentiment endpoint.
url2 = 'https://language.googleapis.com/v1/documents:analyzeSentiment?key={}'

r2 = requests.post(url2.format(MyAPIKEY), json=params)

r2.json()
{'documentSentiment': {'magnitude': 3.1, 'score': 0},
 'language': 'en',
 'sentences': [{'text': {'content': "They're made of plastic.",
    'beginOffset': 0},
   'sentiment': {'magnitude': 0.1, 'score': -0.1}},
  {'text': {'content': 'Living plastic creatures.', 'beginOffset': 25},
   'sentiment': {'magnitude': 0.3, 'score': 0.3}},
  {'text': {'content': "They're being controlled by a relay device in the roof, which would be a great big problem if I didn't have this.",
    'beginOffset': 51},
   'sentiment': {'magnitude': 0.3, 'score': -0.3}},
  {'text': {'content': "So, I'm going to go up there and blow them up, and I might well die in the process, but don't worry about me.",
    'beginOffset': 165},
   'sentiment': {'magnitude': 0.5, 'score': 0.5}},
  {'text': {'content': 'No, you go home.', 'beginOffset': 275},
   'sentiment': {'magnitude': 0.1, 'score': -0.1}},
  {'text': {'content': 'Go on.', 'beginOffset': 292},
   'sentiment': {'magnitude': 0.1, 'score': 0.1}},
  {'text': {'content': 'Go and have your lovely beans on toast.',
    'beginOffset': 299},
   'sentiment': {'magnitude': 0.9, 'score': 0.9}},
  {'text': {'content': "Don't tell anyone about this, because if you do, you'll get them killed.",
    'beginOffset': 339},
   'sentiment': {'magnitude': 0.6, 'score': -0.6}}]}
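
The document score here is 0 while the magnitude is 3.1, which suggests mixed rather than neutral sentiment, so the per-sentence scores are the more useful part. Here is a small sketch for pulling out the most positive and most negative sentences. The helper sentence_scores is a name I made up, and sample_sentiment is a cut-down copy of the output above (three of the eight sentences).

```python
# Cut-down copy of the analyzeSentiment response shown above.
sample_sentiment = {
    "documentSentiment": {"magnitude": 3.1, "score": 0},
    "language": "en",
    "sentences": [
        {"text": {"content": "They're made of plastic."},
         "sentiment": {"magnitude": 0.1, "score": -0.1}},
        {"text": {"content": "Go and have your lovely beans on toast."},
         "sentiment": {"magnitude": 0.9, "score": 0.9}},
        {"text": {"content": "Don't tell anyone about this, because if you do, "
                             "you'll get them killed."},
         "sentiment": {"magnitude": 0.6, "score": -0.6}},
    ],
}


def sentence_scores(response):
    """Return (sentence, score) pairs in document order.

    Hypothetical helper for illustration; not part of the API.
    """
    return [
        (s["text"]["content"], s["sentiment"]["score"])
        for s in response.get("sentences", [])
    ]


scores = sentence_scores(sample_sentiment)
most_positive = max(scores, key=lambda pair: pair[1])
most_negative = min(scores, key=lambda pair: pair[1])

print("most positive:", most_positive)
print("most negative:", most_negative)
```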

 

It is interesting that the entity ‘relay device’ has a salience of 0.5245988, the highest in the list, even though it appears only once in the text, no more often than most of the other words (‘plastic’ actually appears twice). It would be very interesting to gather these results for every Dr. Who episode.
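
To make the frequency versus salience comparison concrete, here is a rough sketch that counts how often each entity name occurs in the text and puts the count next to its salience. The phrase_count helper is my own simple whole-word matcher (a word cloud library may tokenize differently), the text is only the first three sentences of the quote, and the salience values are copied by hand from the analyzeEntities output.

```python
import re

# First three sentences of the quote used in the requests above.
text = (
    "They're made of plastic. Living plastic creatures. "
    "They're being controlled by a relay device in the roof, "
    "which would be a great big problem if I didn't have this."
)

# Salience values copied by hand from the analyzeEntities output.
salience = {
    "relay device": 0.5245988,
    "plastic": 0.2275309,
    "creatures": 0.11286028,
    "roof": 0.04330202,
}


def phrase_count(phrase, source):
    """Count case-insensitive whole-word occurrences of phrase in source.

    Hypothetical helper for illustration only.
    """
    pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
    return len(re.findall(pattern, source.lower()))


comparison = [
    (name, phrase_count(name, text), score) for name, score in salience.items()
]

for name, count, score in comparison:
    print(f"{name:<15}count={count}  salience={score:.3f}")
```

Sorting comparison by count and by salience gives two different rankings, which is the difference I want to look at across whole episodes.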