You have watched Iron Man, right? Have you ever wondered how it would be to have your own JARVIS, your own AI voice assistant? Just imagine how easy it would be to search on Wikipedia, or Google or play videos on YouTube, or even send E-Mails, just with a single voice command. In this article, I will show you how you can make your AI personal assistant using Python.
What can the Assistant do?
- It can play music for you.
- It can do Wikipedia and Google searches for you.
- It is capable of opening websites like Google, YouTube, etc., in a web browser.
- It is capable of opening your Applications with a single voice command.
- And more.
Without wasting any more of your time, let’s start making your A.I.!
Open your IDE
Open your preferred IDE. I am going to use VS Code, but you can use any editor. Start a new project and create a file named assistant.py.
Speak Function
Before our AI assistant can talk to us, it needs a way to turn text into speech. For that, we will define a speak() function. It takes the text to be spoken as its audio argument and reads it aloud.
def speak(audio):
    pass  # placeholder; we will write the body later
Next, we need a text-to-speech engine that the assistant can use to pronounce the text. For that, we will use a Python module called pyttsx3.
What is PYTTSX3?
- A Python library that will help us to convert text to speech. In short, it is a text-to-speech library. It works offline.
To install it, open cmd or a terminal and type the following command.
- pip install pyttsx3
In case of any errors, like
- No module named win32com.client
- No module named win32
- No module named win32api
then install pypiwin32:
- pip install pypiwin32
After successfully installing pyttsx3, import this module into your program.
import pyttsx3
engine = pyttsx3.init('sapi5')  # 'sapi5' is the Windows speech driver
voices = engine.getProperty('voices')  # list of the voices installed on the system
engine.setProperty('voice', voices[0].id)  # use the first installed voice
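- Now, what is sapi5?
- SAPI (Speech API) is an API developed by Microsoft to allow the use of speech recognition and speech synthesis within Windows applications. sapi5 is the pyttsx3 driver for it, so it is available only on Windows.
- What is VoiceId?
- VoiceId helps us to select different voices.
- voices[0].id = Male voice (on a typical Windows installation)
- voices[1].id = Female voice (on a typical Windows installation)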
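The order of the voices list depends on which voices are installed, so index 0 is not guaranteed to be a male voice on every machine. As a minimal sketch, the hypothetical helper below (find_voice_id is not part of pyttsx3) picks a voice by looking for a keyword in its name, assuming each voice object exposes name and id attributes as pyttsx3 voices do. The stand-in voice objects and their names are made up for illustration.

```python
from types import SimpleNamespace


def find_voice_id(voices, keyword, default_index=0):
    """Return the id of the first voice whose name contains keyword.

    Falls back to voices[default_index] when nothing matches.
    """
    for voice in voices:
        if keyword.lower() in voice.name.lower():
            return voice.id
    return voices[default_index].id


# Stand-in objects with made-up names, so the sketch runs without a speech engine.
sample_voices = [
    SimpleNamespace(id="voice-a", name="Example David"),
    SimpleNamespace(id="voice-b", name="Example Zira"),
]

print(find_voice_id(sample_voices, "zira"))
```

With a real engine, the same helper could be used as engine.setProperty('voice', find_voice_id(engine.getProperty('voices'), 'zira')).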
Writing our speak() function
Let’s program our speak() function that we created earlier, so that it will convert our text to speech.
def speak(audio):
    engine.say(audio)  # queue the text to be spoken
    engine.runAndWait()  # Without this call, the speech will not be audible to us.
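Because speak() needs a working speech engine, it can be awkward to try out the rest of the assistant on a machine without audio or without the sapi5 driver. One possible variation, shown here only as a sketch, is to pass the engine in as a parameter and fall back to printing when none is given. The engine parameter and the RecordingEngine class are assumptions added for illustration and are not part of the article's code.

```python
def speak(audio, engine=None):
    """Speak the text with the given engine, or print it when there is none."""
    if engine is None:
        print(f"[assistant] {audio}")
        return
    engine.say(audio)
    engine.runAndWait()


class RecordingEngine:
    """Hypothetical stand-in that records calls instead of producing sound."""

    def __init__(self):
        self.calls = []

    def say(self, text):
        self.calls.append(("say", text))

    def runAndWait(self):
        self.calls.append(("runAndWait",))


fake = RecordingEngine()
speak("Hello, Geek!", fake)
print(fake.calls)
```

Passing the real pyttsx3 engine instead of the stand-in gives the same behaviour as the speak() function above.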
Now that’s done, let’s create our main() function
main() Function
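Now we will create the entry point of our assistant and call our speak() function in it. Python has no built-in main() function; the if __name__ == "__main__": check plays that role, because the code under it runs only when the file is executed directly.
if __name__ == "__main__":
    speak("Hello, Geek!")
Now, whatever text you pass to the speak() function will be converted into speech.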
Woohoo 🥳🥳!!! Now our assistant has its own voice and it can speak.
Creating different functions that our AI can perform
wishMe() Function
After starting the AI, we want it to greet us first, right? For that, we will create a wishMe() function, so that it can greet us according to the time of day.
To provide the current time and date to our AI, we will import a module called datetime.
import datetime
Now, define the wishMe() function.
def wishMe():
    hour = int(datetime.datetime.now().hour)  # current hour, 0 to 23
We have stored the current hour in a variable called “hour”. We will use this value inside an if-elif-else statement to choose the greeting.
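The choice of greeting can be sketched as a small function that is separate from the speech engine, which makes it easy to check. The boundaries used here (morning before 12, afternoon before 18, evening otherwise) follow the complete code at the end of the article; the name greeting_for_hour is a hypothetical helper introduced for this example.

```python
import datetime


def greeting_for_hour(hour):
    """Return the greeting for an hour in the range 0 to 23."""
    if 0 <= hour < 12:
        return "Good Morning!"
    elif 12 <= hour < 18:
        return "Good Afternoon!"
    return "Good Evening!"


current_hour = datetime.datetime.now().hour
print(greeting_for_hour(current_hour))
```

Inside wishMe(), the result could simply be passed to speak().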
Creating a Function that takes Command Input
As a voice assistant, it needs to take commands through the system’s microphone. For that, we will create a takeCommand() function, which listens to the microphone and returns what the user said as a string.
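First, we need to install a package named SpeechRecognition. Reading from the microphone also requires the PyAudio package.
- pip install SpeechRecognition
- pip install PyAudio
After installation, import the module in the program. The package is imported under the name speech_recognition, and we give it the short alias sr.
import speech_recognition as sr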
Let’s define our takeCommand() function
def takeCommand():
    # It takes microphone input from the user and returns string output
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("Listening...")
        r.pause_threshold = 1  # seconds of silence that mark the end of a phrase
        audio = r.listen(source)
The takeCommand() function is created. Now, we will add a try and except block inside it to handle recognition errors.
    try:
        print("Recognizing...")
        query = r.recognize_google(audio, language='en-in')  # Using Google for voice recognition.
        print(f"User said: {query}\n")  # User query will be printed.
    except Exception as e:
        # print(e)
        print("Say that again please...")  # Printed when the speech could not be recognized
        return "None"  # The string "None" will be returned
    return query
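Since takeCommand() returns the string "None" when recognition fails, the caller has to decide what to do with that value. One option, shown here only as a sketch, is to ask again a limited number of times. The helper name listen_until_understood and the max_attempts parameter are assumptions made for this example, and the recognizer is simulated so that the code runs without a microphone.

```python
def listen_until_understood(take_command, max_attempts=3):
    """Call take_command until it returns something other than "None".

    Returns the recognized text, or None if every attempt failed.
    """
    for _ in range(max_attempts):
        query = take_command()
        if query != "None":
            return query
    return None


# Simulated recognizer for illustration: fails twice, then succeeds.
responses = iter(["None", "None", "open youtube"])
print(listen_until_understood(lambda: next(responses)))
```

In the assistant, the real function would be passed in instead: listen_until_understood(takeCommand).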
Defining Tasks for our AI.
Now that our AI is ready to take commands, let us create some tasks that it can perform, e.g. Wikipedia searches, Google searches, opening applications, etc.
Task 1 – Wikipedia Search
For our AI to perform Wikipedia searches, we have to install and import a module called wikipedia into our program.
- pip install wikipedia
import wikipedia
After that, write the logic for the task
if __name__ == "__main__":
    wishMe()
    while True:
        # if 1:
        query = takeCommand().lower()  # Converting user query into lower case
        # Logic for executing tasks based on query
        if 'wikipedia' in query:  # if wikipedia found in the query then this block will be executed
            speak('Searching Wikipedia...')
            query = query.replace("wikipedia", "")
            results = wikipedia.summary(query, sentences=4)
            speak("According to Wikipedia")
            print(results)
            speak(results)
In the above code, we used an if statement to check whether “wikipedia” is in the user’s query. If it is, the word is removed from the query, the rest is used as the search term, and the first four sentences of the summary of the matching Wikipedia page are converted to speech with the help of the speak function.
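Removing only the word “wikipedia” can leave filler words such as “search” or “for” in the search term. As a rough sketch, the hypothetical helper below strips such words from the start and end of the query before it is passed to wikipedia.summary. The list of filler words is an assumption and would need tuning for real speech input.

```python
FILLER_WORDS = {"wikipedia", "search", "on", "for", "about"}  # assumed list


def extract_topic(query):
    """Strip filler words from both ends of the query and return the topic."""
    words = query.lower().split()
    while words and words[0] in FILLER_WORDS:
        words.pop(0)
    while words and words[-1] in FILLER_WORDS:
        words.pop()
    return " ".join(words)


print(extract_topic("search wikipedia for alan turing"))  # prints alan turing
```

Only the ends of the query are trimmed, so a topic such as “war on terror” keeps its inner “on”.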
Task 2 – Opening YouTube in Web-Browser
To open YouTube or any other website using the AI, we need to import a module called webbrowser.
import webbrowser
It is a built-in module, so there is no need to install it.
        elif 'open youtube' in query:
            webbrowser.open("https://www.youtube.com")
Here, we used an elif statement to check whether “open youtube” is in the query. If it is, the AI uses the webbrowser module to open the site in the system’s default web browser. You can use the same logic for any other website.
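When several websites are supported, one elif per site becomes repetitive. A possible alternative, sketched below, keeps the sites in a dictionary and looks up the URL from the query. The SITES table and the url_for_query helper are hypothetical names introduced for this example.

```python
SITES = {
    "youtube": "https://www.youtube.com",
    "google": "https://www.google.com",
    "wikipedia": "https://www.wikipedia.org",
}


def url_for_query(query):
    """Return the URL for an "open <site>" command, or None if no site matches."""
    query = query.lower()
    for name, url in SITES.items():
        if f"open {name}" in query:
            return url
    return None


print(url_for_query("please open google"))
```

In the assistant, the returned URL would be passed to webbrowser.open() when it is not None.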
Task 3 – Play Music
For our AI to be able to play music, we have to import another module called os.
import os
        elif 'play music' in query:
            music_dir = 'PATH TO YOUR MUSIC DIRECTORY'  # Enter the path of your music directory
            songs = os.listdir(music_dir)
            print(songs)
            os.startfile(os.path.join(music_dir, songs[0]))  # os.startfile is available only on Windows
In the above code, we first list all the files in the music directory with os.listdir. Then os.startfile opens a file with its default application, which plays the song. The above code plays the first song in the list. However, you can also pick a song at random with the help of the random module, so that every time you ask it to play music, the AI plays a random song from the directory.
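The random selection could look roughly like this. The sketch works on a list of file names, such as the one returned by os.listdir, and skips files that do not look like audio. The extension list and the choose_song helper are assumptions made for this example.

```python
import random

AUDIO_EXTENSIONS = (".mp3", ".wav", ".flac", ".m4a")  # assumed set; adjust to your files


def choose_song(filenames, rng=random):
    """Return a random audio file name from filenames, or None if there is none."""
    songs = [name for name in filenames if name.lower().endswith(AUDIO_EXTENSIONS)]
    if not songs:
        return None
    return rng.choice(songs)


print(choose_song(["cover.jpg", "track01.mp3", "track02.MP3"]))
```

In the assistant, the chosen name would be joined with music_dir and passed to os.startfile, as in the code above.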
Task 5 – Know Time
        elif 'the time' in query:
            strTime = datetime.datetime.now().strftime("%H:%M:%S")
            speak(f"Sir, the time is {strTime}")
The above code calls datetime.datetime.now() and formats the current system time as a string, which is stored in a variable called strTime. We then pass this variable as an argument to the speak function, which converts the time string into speech.
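A time such as “15:05:42” can sound awkward when read aloud. As an optional variation, the sketch below builds a 12-hour string without seconds. The spoken_time helper is a hypothetical name, and the format is a matter of taste rather than something the article prescribes.

```python
import datetime


def spoken_time(moment):
    """Format a datetime as a 12-hour clock string such as '3:05 PM'."""
    hour12 = moment.hour % 12 or 12  # 0 and 12 are both spoken as 12
    suffix = "AM" if moment.hour < 12 else "PM"
    return f"{hour12}:{moment.minute:02d} {suffix}"


print(spoken_time(datetime.datetime(2024, 1, 1, 15, 5)))  # prints 3:05 PM
```

In the assistant, spoken_time(datetime.datetime.now()) could replace the strftime call.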
Task 6 – To Open an Application
        elif 'open notepadplusplus' in query or 'start notepadplusplus' in query:
            app = r"C:\Tools\Notepad++\notepad++.exe"  # Add the path of your app; the r prefix keeps the backslashes literal
            os.startfile(app)
Here, we are again using the os module to open the app. First we store the path of the target file in a string called ‘app’. Then we open the file with os.startfile.
You can use the same logic for any other app you want to open.
- Replace ‘notepadplusplus’ with the name of the app you want to open.
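When one task should respond to several phrasings, each phrase needs its own membership test. An expression such as 'open x' or 'start x' in query is always true, because Python treats the non-empty string 'open x' as a condition on its own. A small helper built on any() avoids this; matches_any and the phrase list below are hypothetical names used for illustration.

```python
def matches_any(query, phrases):
    """Return True if at least one of the phrases occurs in the query."""
    return any(phrase in query for phrase in phrases)


OPEN_EDITOR_PHRASES = ["open notepadplusplus", "start notepadplusplus"]

print(matches_any("please start notepadplusplus", OPEN_EDITOR_PHRASES))  # True
print(matches_any("what is the time", OPEN_EDITOR_PHRASES))  # False

# The pitfall: this is True for every query, because the first string is truthy.
print(bool("open notepadplusplus" or "start notepadplusplus" in "what is the time"))  # True
```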
How to get the path of the app
- Right-click on the app and select “Open file location”
- After opening, right-click on the application and select “Properties”
- Copy the content of “Location” under the General tab.
Let’s see what we have done so far.
- First, we created a wishMe() function that lets our A.I. greet us according to the system time.
- After wishme() function, we created a takeCommand() function, to help our A.I. take commands from the user. This function is also responsible for returning the user’s query in a string format.
- We developed code to open different websites like youtube or others.
- We developed code to open any application.
Complete code:
import pyttsx3 #pip install pyttsx3
import speech_recognition as sr #pip install speechRecognition
import datetime
import wikipedia #pip install wikipedia
import webbrowser
import os
engine = pyttsx3.init('sapi5')
voices = engine.getProperty('voices')
# print(voices[1].id)
engine.setProperty('voice', voices[0].id)

def speak(audio):
    engine.say(audio)
    engine.runAndWait()

def wishMe():
    hour = int(datetime.datetime.now().hour)
    if hour>=0 and hour<12:
        speak("Good Morning!")
    elif hour>=12 and hour<18:
        speak("Good Afternoon!")
    else:
        speak("Good Evening!")
    speak("I am your assistant. Please tell me how may I help you")

def takeCommand():
    #It takes microphone input from the user and returns string output
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("Listening...")
        r.pause_threshold = 1
        audio = r.listen(source)
    try:
        print("Recognizing...")
        query = r.recognize_google(audio, language='en-in')
        print(f"User said: {query}\n")
    except Exception as e:
        # print(e)
        print("Say that again please...")
        return "None"
    return query

if __name__ == "__main__":
    wishMe()
    while True:
        # if 1:
        query = takeCommand().lower()
        # Logic for executing tasks based on query
        if 'wikipedia' in query:
            speak('Searching Wikipedia...')
            query = query.replace("wikipedia", "")
            results = wikipedia.summary(query, sentences=2)
            speak("According to Wikipedia")
            print(results)
            speak(results)
        elif 'open youtube' in query:
            webbrowser.open("https://www.youtube.com")
        elif 'open google' in query:
            webbrowser.open("https://www.google.com")
        elif 'play music' in query:
            music_dir = 'PATH TO YOUR MUSIC DIRECTORY' #Enter the path of your music directory
            songs = os.listdir(music_dir)
            print(songs)
            os.startfile(os.path.join(music_dir, songs[0]))
        elif 'the time' in query:
            strTime = datetime.datetime.now().strftime("%H:%M:%S")
            speak(f"Sir, the time is {strTime}")
        elif 'open notepadplusplus' in query:
            app = r"C:\Tools\Notepad++\notepad++.exe" #raw string, so the backslashes are kept as they are
            os.startfile(app)
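As more tasks are added, the chain of elif branches keeps growing. One way to keep it manageable, sketched here under the assumption that every task can be written as a function taking the query, is a list of phrase and handler pairs that a single loop walks through. The names build_commands and dispatch are hypothetical, and the handlers only record a message so that the example runs without a speech engine.

```python
def build_commands(speak):
    """Map trigger phrases to handler functions that receive the full query."""
    return [
        ("the time", lambda query: speak("Telling the time")),
        ("open youtube", lambda query: speak("Opening YouTube")),
    ]


def dispatch(query, commands):
    """Run the first handler whose phrase occurs in query; return True if one ran."""
    for phrase, handler in commands:
        if phrase in query:
            handler(query)
            return True
    return False


spoken = []  # collects output instead of using a speech engine
commands = build_commands(spoken.append)
print(dispatch("please open youtube", commands), spoken)
```

In the assistant, the real speak() function would be passed to build_commands, and dispatch(query, commands) would replace the if/elif chain inside the while loop.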
Is it really like Tony Stark’s JARVIS?
Many people will argue that the virtual assistant we have created is not an A.I., but just the output of a bunch of statements we wrote. But what is an A.I., basically? The sole purpose of A.I. is to develop machines that can perform human tasks as effectively as humans, or even more effectively. And for the tasks above, our “AI” is efficient enough to do that.