How To Use Google Knowledge Graph Search API With Python

Have you stumbled upon something called Google Knowledge Graph Search and you’re not sure what it is?

Stick around. In this post, we’ll delve into what it is and how to access its enormous database through its API. We’ll also write an example Python script that retrieves information from the API and stores it locally.

Basically, it’s the knowledge base behind the quick answers that appear on Google’s search results page. These include the featured snippets that appear right below the search bar and the knowledge panels that appear on the right side of the search results.

Moreover, it’s powered by machine learning and a massive dataset covering a wide range of topics. Its purpose is to help Google understand the context and intent of a search query by using principles of semantic search.

Using the API

Before we start writing any code, we should first set up access to the Knowledge Graph Search API in the Google Cloud Console. In case you haven’t used any APIs from the Google Cloud Console yet, don’t worry, we’ll go through the whole thing, step by step.

Set up Knowledge Graph Search API access

  1. Create a project on Google Cloud Console
  2. Navigate to APIs & Services > Credentials
  3. Click on the CREATE CREDENTIALS button and select API key from the dropdown
  4. Copy the API key and store it inside a .env file in the project folder
  5. Navigate to the Library tab and search for Knowledge Graph Search API in the search bar
  6. Click on the API to open its details and click on the ENABLE button

Alright, after you’ve done all that, you’re good to go and start coding the project.
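To make the role of the API key concrete, here is a minimal sketch of how a request URL for the API can be assembled with the standard library alone. The helper name build_search_url and the placeholder key are assumptions made for illustration. The endpoint and the query, limit, and key parameters are the ones used by the script later in this post.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Endpoint used by the script in this post.
ENDPOINT = 'https://kgsearch.googleapis.com/v1/entities:search'


def build_search_url(query, api_key, limit=1):
    """Hypothetical helper: build the GET URL for a Knowledge Graph search."""
    params = {'query': query, 'limit': limit, 'key': api_key}
    # urlencode percent-encodes the values and joins them with '&'.
    return ENDPOINT + '?' + urlencode(params)


# 'YOUR_API_KEY' is a placeholder, not a working key.
url = build_search_url('semantic search', 'YOUR_API_KEY')

# Decode the query string again to confirm the values survive encoding.
parsed = parse_qs(urlparse(url).query)
print(url)
print(parsed['query'][0])
```

Building the URL separately from sending the request makes it easy to inspect exactly what will be sent before spending any API quota.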

Prerequisites

As with any other Python project, the first thing we need to do is import all the necessary libraries and tools. Our project will use modules for fetching data with GET requests, parsing JSON data, and storing it locally in a .csv file.

import os
import json
import requests
import argparse
import pandas as pd
from urllib.parse import urlencode
from tqdm import tqdm
from dotenv import load_dotenv

We’re also going to set a couple of necessary things right from the get-go: loading the .env contents, setting the maximum width of pandas columns, and defining a constant that holds the path of the project folder.

load_dotenv()
pd.set_option('max_colwidth', 150)
ROOT = os.path.dirname(__file__)

At this point, your project folder should contain two files: the main Python file and the .env file. The .env file should hold your API key.

KGS_API_KEY=<your API key>
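A missing or placeholder key only shows up later as a failed request, so it can help to check for it up front. The sketch below is one possible way to do that. The function name require_api_key and its error message are assumptions, not part of the original script, and it takes a plain dictionary so it can be tried without touching the real environment.

```python
import os


def require_api_key(env=None, name='KGS_API_KEY'):
    """Hypothetical helper: return the API key or fail with a clear message."""
    # Fall back to the real environment when no mapping is passed in.
    env = os.environ if env is None else env
    key = (env.get(name) or '').strip()
    # Reject an empty value and the '<your API key>' placeholder from the example.
    if not key or key.startswith('<'):
        raise RuntimeError(name + ' is not set. Add it to the .env file.')
    return key


# Demonstration with a made-up value instead of the real environment.
print(require_api_key({'KGS_API_KEY': ' abc123 '}))
```

In the actual script, calling require_api_key() with no arguments after load_dotenv() would read the value loaded from the .env file.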

Fetching & parsing API data

Now, we’re going to write two functions, one for fetching the data from the API and another for parsing it and returning a pandas dataframe. Later, we’ll use the pandas .to_csv method to save the data locally.

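def get_knowledge_graph(query):
    """Query the API and return the parsed JSON, or None if the request fails."""
    endpoint = 'https://kgsearch.googleapis.com/v1/entities:search'
    params = {
        'query': query,
        'limit': 1,  # raise this to get more than one entity per query
        'indent': True,
        'key': os.getenv('KGS_API_KEY')
    }
    url = endpoint + '?' + urlencode(params)
    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        return json.loads(response.text)
    except (requests.RequestException, ValueError) as error:
        # Covers network failures, HTTP error statuses, and malformed JSON.
        print(f'Request failed: {error}')
        return None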

def get_data(query):
    """Return a dataframe with the name, URL, and description of each result."""
    columns = ['Name', 'URL', 'Description']

    result = get_knowledge_graph(query)
    if not result:
        return None

    rows = []
    # .get() guards against responses that lack the result list.
    for res in tqdm(result.get('itemListElement', [])):
        try:
            name = res['result']['name']
            description = res['result']['detailedDescription']
            rows.append({
                'Name': name,
                'URL': description['url'],
                'Description': description['articleBody']
            })
        except KeyError:
            # Skip entities that are missing any of the fields we need.
            continue

    # Build the dataframe once instead of concatenating inside the loop.
    return pd.DataFrame(rows, columns=columns)
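The parsing step can be tried without an API key or a network connection by running it on a hand-written payload. The sketch below assumes a response containing only the keys the script reads (itemListElement, result, name, detailedDescription, url, articleBody). The sample values are made up for illustration and are not real API output, and extract_rows is a hypothetical stand-in for the loop in get_data that returns plain dictionaries instead of a dataframe.

```python
# Made-up payload that mirrors only the keys the script reads.
SAMPLE_RESPONSE = {
    'itemListElement': [
        {
            'result': {
                'name': 'Example Entity',
                'detailedDescription': {
                    'url': 'https://example.com/entity',
                    'articleBody': 'A made-up description used only for this sketch.'
                }
            }
        },
        {
            # This entry has no detailedDescription, so it should be skipped.
            'result': {'name': 'Entity Without Description'}
        }
    ]
}


def extract_rows(payload):
    """Hypothetical helper: turn a response dict into a list of row dicts."""
    rows = []
    for item in payload.get('itemListElement', []):
        result = item.get('result', {})
        detail = result.get('detailedDescription')
        # Keep only entries that have both a name and a detailed description.
        if 'name' not in result or not detail:
            continue
        rows.append({
            'Name': result['name'],
            'URL': detail.get('url'),
            'Description': detail.get('articleBody')
        })
    return rows


rows = extract_rows(SAMPLE_RESPONSE)
print(rows)
```

A list of dictionaries like this can be passed straight to pd.DataFrame to get the same three columns the script produces.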

Put it all together

The last thing we need to do is put everything into action and call the functions we just created from the script’s main block. We’re also going to add an argument parser, so we can use the script directly from the console.

if __name__ == '__main__':

    parser = argparse.ArgumentParser(description='Returns results from Google Knowledge Graph')
    parser.add_argument('-q', '--query', required=True, help='Provide query to search for in Google Knowledge Graph')
    args = parser.parse_args()

    info = get_data(args.query)
    if info is None:
        # get_data() returns None when the request fails or comes back empty.
        raise SystemExit('No data returned for query: ' + args.query)

    info.to_csv(os.path.join(ROOT, 'GKGS_results_for_' + args.query + '.csv'))
    print(info.head(50))

Great! We can finally use the script to get the data from the Knowledge Graph Search API. The following command is an example of how we can do it.

python main.py -q "semantic search"
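Because the output file name is built directly from the query, a query containing characters such as slashes or question marks can produce an invalid path. One possible safeguard, sketched here with a hypothetical safe_filename helper that is not part of the original script, is to replace anything outside a conservative set of characters before building the name.

```python
import re


def safe_filename(query, prefix='GKGS_results_for_', suffix='.csv'):
    """Hypothetical helper: build a file name that is safe on common filesystems."""
    # Replace every run of characters outside letters, digits, '.', '_' and '-'.
    slug = re.sub(r'[^A-Za-z0-9._-]+', '_', query).strip('_')
    # Fall back to a fixed word if nothing usable is left.
    return prefix + (slug or 'query') + suffix


print(safe_filename('semantic search'))
print(safe_filename('AC/DC: live?'))
```

Note that this simple pattern also replaces non-ASCII letters, so queries in other scripts would need a more permissive rule.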

Entire code

Here is the whole code of the project, which you can also find in the GitHub repository I made for it.

import os
import json
import requests
import argparse
import pandas as pd
from urllib.parse import urlencode
from tqdm import tqdm
from dotenv import load_dotenv

load_dotenv()
pd.set_option('max_colwidth', 150)
ROOT = os.path.dirname(__file__)

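def get_knowledge_graph(query):
    """Query the API and return the parsed JSON, or None if the request fails."""
    endpoint = 'https://kgsearch.googleapis.com/v1/entities:search'
    params = {
        'query': query,
        'limit': 1,  # raise this to get more than one entity per query
        'indent': True,
        'key': os.getenv('KGS_API_KEY')
    }
    url = endpoint + '?' + urlencode(params)
    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        return json.loads(response.text)
    except (requests.RequestException, ValueError) as error:
        # Covers network failures, HTTP error statuses, and malformed JSON.
        print(f'Request failed: {error}')
        return None

def get_data(query):
    """Return a dataframe with the name, URL, and description of each result."""
    columns = ['Name', 'URL', 'Description']

    result = get_knowledge_graph(query)
    if not result:
        return None

    rows = []
    # .get() guards against responses that lack the result list.
    for res in tqdm(result.get('itemListElement', [])):
        try:
            name = res['result']['name']
            description = res['result']['detailedDescription']
            rows.append({
                'Name': name,
                'URL': description['url'],
                'Description': description['articleBody']
            })
        except KeyError:
            # Skip entities that are missing any of the fields we need.
            continue

    # Build the dataframe once instead of concatenating inside the loop.
    return pd.DataFrame(rows, columns=columns)

if __name__ == '__main__':

    parser = argparse.ArgumentParser(description='Returns results from Google Knowledge Graph')
    parser.add_argument('-q', '--query', required=True, help='Provide query to search for in Google Knowledge Graph')
    args = parser.parse_args()

    info = get_data(args.query)
    if info is None:
        # get_data() returns None when the request fails or comes back empty.
        raise SystemExit('No data returned for query: ' + args.query)

    info.to_csv(os.path.join(ROOT, 'GKGS_results_for_' + args.query + '.csv'))
    print(info.head(50))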

Conclusion

To conclude, we made a simple Python script for fetching information from the Google Knowledge Graph Search API. We also explained what the Knowledge Graph Search is and how it’s used on Google’s search results page.

I learned a lot while working on this project, and I hope you will find it helpful as well.