Tutorial: Building a Data Analysis Agent with Pandas and Gemini

In this tutorial, we demonstrate how to integrate pandas, Python’s robust data manipulation library, with Google Cloud’s advanced generative capabilities through the google.generativeai package and a Gemini model. After setting up the environment with the necessary libraries, configuring the Google Cloud API key, and importing IPython’s display features, the code walks step by step through building a data science agent that analyzes a sample sales dataset. The example shows how to convert a DataFrame to Markdown format and then use natural language queries to generate insights about the data, highlighting the potential of combining traditional data analysis tools with modern AI-driven methods.

!pip install pandas google-generativeai tabulate --quiet

First, we quietly install pandas and the google-generativeai library, along with tabulate, which pandas needs for DataFrame.to_markdown. This sets up the environment for data manipulation and AI-driven analysis.

import pandas as pd
import google.generativeai as genai
from IPython.display import Markdown

We import pandas for data manipulation, google.generativeai to access Google’s generative AI capabilities, and Markdown from IPython.display to render Markdown-formatted output.

GOOGLE_API_KEY = "Use Your API Key Here"
genai.configure(api_key=GOOGLE_API_KEY)


model = genai.GenerativeModel('gemini-2.0-flash-lite')

We assign a placeholder API key, configure the google.generativeai client with it, and initialize the ‘gemini-2.0-flash-lite’ generative model for generating content.
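As a possible alternative to pasting the key into the notebook, the sketch below reads it from an environment variable. The helper name load_api_key and the variable name GOOGLE_API_KEY are assumptions made for this illustration, not part of the original code.

```python
import os


def load_api_key(env_var: str = "GOOGLE_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    The variable name is an assumption for this sketch; use whatever
    name your own environment defines.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before running the notebook."
        )
    return key


# Usage with the tutorial's client (commented out so the sketch runs
# without credentials):
# import google.generativeai as genai
# genai.configure(api_key=load_api_key())
```

Keeping the key out of the source makes it less likely to be committed or shared by accident along with the notebook.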

data = {'Product': ['Laptop', 'Mouse', 'Keyboard', 'Monitor', 'Webcam', 'Headphones'],
        'Category': ['Electronics', 'Electronics', 'Electronics', 'Electronics', 'Electronics', 'Electronics'],
        'Region': ['North', 'South', 'East', 'West', 'North', 'South'],
        'Units Sold': [150, 200, 180, 120, 90, 250],
        'Price': [1200, 25, 75, 300, 50, 100]}
sales_df = pd.DataFrame(data)


print("Sample Sales Data:")
print(sales_df)
print("-" * 30)

Here we create a pandas DataFrame named sales_df containing sample sales data for several products, then print the DataFrame followed by a separator line to visually set off the output.
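Since language models can slip on arithmetic, it may help to compute reference values directly with pandas for the kinds of questions asked later in the tutorial. The following is a minimal sketch using the same sample data; the variable names are illustrative assumptions.

```python
import pandas as pd

# Same sample data as in the tutorial.
data = {
    "Product": ["Laptop", "Mouse", "Keyboard", "Monitor", "Webcam", "Headphones"],
    "Category": ["Electronics"] * 6,
    "Region": ["North", "South", "East", "West", "North", "South"],
    "Units Sold": [150, 200, 180, 120, 90, 250],
    "Price": [1200, 25, 75, 300, 50, 100],
}
sales_df = pd.DataFrame(data)

# Reference answers computed deterministically with pandas.
total_units = int(sales_df["Units Sold"].sum())
top_product = sales_df.loc[sales_df["Units Sold"].idxmax(), "Product"]
average_price = float(sales_df["Price"].mean())
north_products = sales_df.loc[sales_df["Region"] == "North", "Product"].tolist()
revenue = (sales_df["Units Sold"] * sales_df["Price"]).tolist()
revenue_by_product = dict(zip(sales_df["Product"], revenue))

print(total_units)              # 990
print(top_product)              # Headphones
print(round(average_price, 2))  # 291.67
print(north_products)           # ['Laptop', 'Webcam']
print(revenue_by_product)
```

These values can be compared against the model's answers to spot any response that does not match the data.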

def ask_gemini_about_data(dataframe, query):
    """
    Asks the configured Gemini model a question about the given pandas DataFrame.

    Args:
        dataframe: The pandas DataFrame to analyze.
        query: The natural language question about the DataFrame.

    Returns:
        The model's response as a string.
    """
    prompt = f"""You are a data analysis agent. Analyze the following pandas DataFrame and answer the question.


    DataFrame:
    ```
    {dataframe.to_markdown(index=False)}
    ```


    Question: {query}


    Answer:
    """
    response = model.generate_content(prompt)
    return response.text

Here we build a prompt that embeds the pandas DataFrame as a Markdown table alongside a natural language question, then use the Gemini model to generate and return an analytical response.
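To inspect exactly what is sent to the model without making an API call, the prompt construction can be separated into its own function. The sketch below is one way to do that; build_prompt is a hypothetical helper name, the code-fence markers around the table are omitted, and the plain-text fallback is an assumption added here because DataFrame.to_markdown depends on the optional tabulate package.

```python
import pandas as pd


def build_prompt(dataframe: pd.DataFrame, query: str) -> str:
    """Assemble the prompt text (hypothetical helper, not in the original code)."""
    try:
        # to_markdown relies on the optional `tabulate` dependency.
        table = dataframe.to_markdown(index=False)
    except ImportError:
        # Plain-text fallback when tabulate is not installed.
        table = dataframe.to_string(index=False)
    return (
        "You are a data analysis agent. Analyze the following pandas DataFrame "
        "and answer the question.\n\n"
        f"DataFrame:\n{table}\n\n"
        f"Question: {query}\n\n"
        "Answer:"
    )


demo_df = pd.DataFrame({"Product": ["Laptop", "Mouse"], "Units Sold": [150, 200]})
prompt = build_prompt(demo_df, "Which product sold more units?")
print(prompt)
```

Printing the prompt first is a cheap way to confirm that the table renders as expected before spending API quota.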

# Query 1: What is the total number of units sold across all products?
query1 = "What is the total number of units sold across all products?"
response1 = ask_gemini_about_data(sales_df, query1)
print(f"Question 1: {query1}")
print(f"Answer 1:\n{response1}")
print("-" * 30)

[Figure: Query 1 output]

# Query 2: Which product had the highest number of units sold?
query2 = "Which product had the highest number of units sold?"
response2 = ask_gemini_about_data(sales_df, query2)
print(f"Question 2: {query2}")
print(f"Answer 2:\n{response2}")
print("-" * 30)

[Figure: Query 2 output]

# Query 3: What is the average price of the products?
query3 = "What is the average price of the products?"
response3 = ask_gemini_about_data(sales_df, query3)
print(f"Question 3: {query3}")
print(f"Answer 3:\n{response3}")
print("-" * 30)

[Figure: Query 3 output]

# Query 4: Show me the products sold in the 'North' region.
query4 = "Show me the products sold in the 'North' region."
response4 = ask_gemini_about_data(sales_df, query4)
print(f"Question 4: {query4}")
print(f"Answer 4:\n{response4}")
print("-" * 30)

[Figure: Query 4 output]

# Query 5. More complex query: Calculate the total revenue for each product.
query5 = "Calculate the total revenue (Units Sold * Price) for each product and present it in a table."
response5 = ask_gemini_about_data(sales_df, query5)
print(f"Question 5: {query5}")
print(f"Answer 5:\n{response5}")
print("-" * 30)

[Figure: Query 5 output]
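The five query blocks above repeat the same ask-and-print pattern, so they could be collapsed into a loop. The sketch below assumes a hypothetical run_queries helper that accepts any callable taking a question and returning a string; a stub stands in for the model so the example runs offline.

```python
from typing import Callable, Dict, List


def run_queries(ask: Callable[[str], str], queries: List[str]) -> Dict[str, str]:
    """Send each question through `ask`, print it, and collect the answers."""
    answers: Dict[str, str] = {}
    for number, query in enumerate(queries, start=1):
        answer = ask(query)
        print(f"Question {number}: {query}")
        print(f"Answer {number}:\n{answer}")
        print("-" * 30)
        answers[query] = answer
    return answers


def fake_ask(query: str) -> str:
    """Offline stand-in for the model call (assumption for this sketch)."""
    return f"(stub answer to: {query})"


queries = [
    "What is the total number of units sold across all products?",
    "Which product had the highest number of units sold?",
    "What is the average price of the products?",
]
results = run_queries(fake_ask, queries)

# With the tutorial's objects, the call might look like:
# run_queries(lambda q: ask_gemini_about_data(sales_df, q), queries)
```

Passing the asking function as a parameter keeps the loop independent of the API, which also makes it easy to try out without a key.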

In conclusion, the tutorial illustrates how combining pandas, the google.generativeai package, and a Gemini model can turn data analysis tasks into a more interactive and insightful process. The approach simplifies querying and interpreting data and opens the way to more advanced uses such as data cleaning, feature engineering, and exploratory data analysis. By using these tools within the familiar Python ecosystem, data scientists can improve their productivity and more easily derive meaningful insights from complex datasets.




Asif Razzaq is the CEO of Marktechpost Media Inc. His latest endeavor is the launch of an artificial intelligence media platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform boasts over 2 million monthly views, illustrating its popularity among audiences.
