Data Structures and Algorithms in Python – Recursion

Computes the cumulative sum – Recursion

Aim is to write a recursive function which takes an integer and computes the cumulative sum of 0 to that integer. For example, if n=4 , return 4+3+2+1+0, which is 10. We always should take care of the base case. In this case, we have a base case of n =0 (Note, you could have also designed the cut off to be 1). In this case, we have: n + (n-1) + (n-2) + …. + 0. Example:

print (recursion_cululative_sum(5))

Sum of digits – Recursion

Given an integer, create a function which returns the sum of all the individual digits in that integer. For example: if n = 4321, return 4+3+2+1 Example:

print (recursion_sum_digits(12))

Word split – Recursion

Create a function called word_split() which takes in a string phrase and a set list_of_words. The function will then determine if it is possible to split the string in a way in which words can be made from the list of words. You can assume the phrase will only contain words found in the dictionary if it is completely splittable. Example:

 print (word_split('themanran',['the','ran','man']))

Reverse a string – Recursion

Implement a recursive reverse. Example:

 print(reverse_str('hello world'))

List all the permutation of a string – Recursion

Given a string, write a function that uses recursion to output a list of all the possible permutations of that string. For example, given s=’abc’ the function should return [‘abc’, ‘acb’, ‘bac’, ‘bca’, ‘cab’, ‘cba’] This way of doing permutaion is for learning in real scenerios better to use excellant Python library “ltertools” with current approach there are n! permutations, so the it looks that algorithm will take O(n*n!)time Example:

 print(permute('abc'))

Implement fibonacci sequence with simple iteration

# We'll try to find the 9th no in the fibonacci sequence which is 34
print (fibonacci_itertaive(9))
# 34
# 0, 1, 1, 2, 3, 5, 8, 13, 21, 34
# The recursive solution is exponential time Big-O , with O(n).

Implement fibonacci sequence – Recursion

Our function will accept a number n and return the nth number of the fibonacci sequence Remember that a fibonacci sequence: 0,1,1,2,3,5,8,13,21,… starts off with a base case checking to see if n = 0 or 1, then it returns 1. Else it returns fib(n-1)+fib(n+2).

# We'll try to find the 9th no in the fibnacci sequence which is 34
print (fibonacci_recursion(9))
# 0, 1, 1, 2, 3, 5, 8, 13, 21, 34
# The recursive solution is exponential time Big-O , with O(2^n). However, 
# its a very simple and basic implementation to consider

Implement fibonacci sequence – Dynamic programming

Implement the function using dynamic programming by using a cache to store results (memoization). memoization + recursion = dynamic programming

# We'll try to find the 9th no in the fibnacci sequence which is 34
print (fibonacci_dynamic(9))
# 0, 1, 1, 2, 3, 5, 8, 13, 21, 34
# The recursive-memoization solution is exponential time Big-O , with O(n)

Implement coin change problem – Recursion

Given a target amount n and a list (array) of distinct coin values, what’s the fewest coins needed to make the change amount. 1+1+1+1+1+1+1+1+1+1 5 + 1+1+1+1+1 5+5 10 With 1 coin being the minimum amount.

print (coin_change_recursion(8,[1,5]))
# 4
# Note:
# The problem with this approach is that it is very inefficient! It can take many, 
# many recursive calls to finish this problem and its also inaccurate for non 
# standard coin values (coin values that are not 1,5,10, etc.)

Implement coin change problem – Dynamic programming

Given a target amount n and a list (array) of distinct coin values, what’s the fewest coins needed to make the change amount.

1+1+1+1+1+1+1+1+1+1
5 + 1+1+1+1+1
5+5
10
With 1 coin being the minimum amount.
# Caching
target = 74
coins = [1,5,10,25]
known_results = [0]*(target+1)

print (coin_change_dynamic(target,coins,known_results))

Check GitHub for the full working code.

I will keep adding more problems/solutions.

Stay tuned!

Ref:  The inspiration of implementing DS in Python is from this course

Implemeting Data Structures and Algorithms in Python: Problems and solutions Contd..

Contd.. from last post

Array Pair SumArray Pair Sum

Given an integer array, output all the unique pairs that sum up to a specific value k. So the input:

sum_arr_uniq_pairs([1,2,2,3,4,1,1,3,2,1,3,1,2,2,4,0],5) 

would return 2 pairs:
(2,3)  (1,4)

Find a missing element in an array/list

Consider an array of non-negative integers. A second array is formed by shuffling the elements of the first array and deleting a random element. Given these two arrays, find which element is missing in the second array. Here is an example input, the first array is shuffled and the number 5 is removed to construct the second array.
Input:

 find_missing_ele([1,2,3,4,5,6,7],[3,7,2,1,4,6]) 

Output:5 is the missing numberStack class implementation

Implement basic stack operations (LIFO)

push() – Push an element in a stack pop()- POP an element from top of the stackpeek() – Just peek into top element of the stack (don’t perform any operation)Queue class implementation

Implement basic Queue operations (FIFO)

enqueue – adding a element to the queue dequeue – removing an element from the queueDeque (DECK) class implementation

Implement basic operation in deque (Add and remove elements both at front and rear)

addFront()

Add an element at the front

addRear()

Add an element at the rear

removeFront()

Remove from front

removeRear()

Remove from rear

Balance parentheses using stack/list

Given a string of opening and closing parentheses, check whether it’s balanced. We have 3 types of parentheses: round brackets: () square brackets: [] curly brackets: {}. Assume that the string doesn’t contain any other character than these, no spaces words or numbers. As a reminder, balanced parentheses require every opening parenthesis to be closed in the reverse order opened. For example ‘([])’ is balanced but ‘([)]’ is not. Algo will take a string as the input string and will return boolean (TRUE/FALSE) Examples:

print (check_parentheses_match('([])'))
print (check_parentheses_match('[](){([[[]]])'))

 

Queue with 2 stack implementation

This is a classic problem. We need to use the basic characteristics of the stack (popping out elements in reverse order) will make a queue. Example:

# Create a object of the class
qObj = QueueWith2Stack()
# Add an element 
qObj.enqueue(1)
# Add another element
qObj.enqueue(2)
# Add more element
qObj.enqueue(4)
# Add more element
qObj.enqueue(8)
# Remove item
print (qObj.dequeue())
# Remove item 
print (qObj.dequeue())
# Remove item
print (qObj.dequeue())
# Remove item 
print (qObj.dequeue())

 

Singly Linked List class implementation

Implement basic skeleton for a Singly Linked List Example:

# Added node
a = LinkedListNode(1)
b = LinkedListNode(2)
c = LinkedListNode(3)        
# Set the pointers
a.nextnode = bb.nextnode = c
print (a.value)
print (b.value)
print (c.value)
# Print using class 
print (a.nextnode.value)

Doubly Linked List class implementation

Implement basic skeleton for a Doubly Linked List Example:


# Added node
a = DoublyLinkedListNode(1)
b = DoublyLinkedListNode(2)
c = DoublyLinkedListNode(3)        
# Set the pointers
# setting b after a (a before b)
b.prev_node = a
a.next_node = b
# Setting c after a
b.next_node = c
c.prev_node = b
print (a.value)
print (b.value)
print (c.value)
# Print using class 
print (a.next_node.value)
print (b.next_node.value)
print (b.prev_node.value)
print (c.prev_node.value)

 

Reverse a linked list implementation

The aim is to write a function to reverse a Linked List in place. The function will take in the head of the list as input and return the new head of the list. Example:


# Create a Linked List 
a = LinkedListNode(1)
b = LinkedListNode(2)
c = LinkedListNode(3)
d = LinkedListNode(4)
a.nextnode = b
b.nextnode = c
c.nextnode = d
print (a.nextnode.value)
print (b.nextnode.value)
print (c.nextnode.value)
# Call the reverse()
LinkedListNode().reverse(a)
print (d.nextnode.value)
print (c.nextnode.value)
print (b.nextnode.value)

 

Linked list Nth to the last node

The aim is a function that takes a head node and an integer value n and then returns the nth to last node in the linked list. Example:


# Create a Linked List 
a = LinkedListNode(1)
b = LinkedListNode(2)
c = LinkedListNode(3)
d = LinkedListNode(4)
e = LinkedListNode(5)

a.nextnode = b
b.nextnode = c
c.nextnode = d
d.nextnode = e

print (a.nextnode.value)
print (b.nextnode.value)
print (c.nextnode.value)
print (d.nextnode.value)

# This would return the node d with a value of 4, because its the 2nd to last node.
target_node = LinkedListNode().nth_to_last_node(2, a) 
print (target_node.value)
# Ans: d=4


Check GitHub for the full working code.

I will keep adding more problems/solutions.

Stay tuned!

Ref:  The inspiration of implementing DS in Python is from this course

Implemeting Data Structures and Algorithms in Python: Problems and solutions

Recently I have started using Python in a lot of places including writing algorithms for MI/data science,  so I thought to try to implement some common programming problems using data structures in Python. As I have mostly implemented in C/C++ and Perl.

Let’s get started with a very basic problem.

Anagram algorithm

An algorithm will take two strings and check to see if they are anagrams. An anagram is when the two strings can be written using the exact same letters, in other words, rearranging the letters of a word or phrase to produce a new word or phrase, using all the original letters exactly once

Some examples of anagram:
“dormitory” is an anagram of “dirty room”
“a perfectionist” is an anagram of “I often practice.”
“action man” is an anagram of “cannot aim”

Our anagram check algorithm with take two strings and will give a boolean TRUE/FALSE depends on anagram found or not?
I have used two approaches to solve the problem. First is to sorted function and compare two string after removing white spaces and changing to lower case. This is straightforward.

like


def anagram(str1,str2):
# First we'll remove white spaces and also convert string to lower case letters
str1 = str1.replace(' ','').lower()
str2 = str2.replace(' ','').lower()
# We'll show output in the form of boolean TRUE/FALSE for the sorted match hence return
return sorted(str1) == sorted(str2)

The second approach is to do things manually, this is because to learn more about making logic to check. In this approach, I have used a counting mechanism and Python dictionary to store the count letter. Though one can use inbuilt Python collections idea is to learn a bit about the hash table.

Check GitHub for the full working code.

I will keep adding more problems/solutions.

Stay tuned!

Ref:  The inspiration of implementing DS in Python is from this course

Choropleth Maps in Python

Choropleth maps are a great way to represent geographical data. I have done a basic implementation of two different data sets. I have used jupyter notebook to show the plots.

World Power Consumption 2014

First do Plotly imports

import plotly.graph_objs as go
from plotly.offline import init_notebook_mode,iplot
init_notebook_mode(connected=True)

Next step is to fetch the dataset, we’ll use Python pandas library to read the read the csv file

import pandas as pd
df = pd.read_csv('2014_World_Power_Consumption')

Next, we need to create data and layout variable which contains a dict

data = dict(type='choropleth',
locations = df['Country'],
locationmode = 'country names', z = df['Power Consumption KWH'],
text = df['Country'], colorbar = {'title':'Power Consumption KWH'},
colorscale = 'Viridis', reversescale = True)

Let’s make a layout

layout = dict(title='2014 World Power Consumption',
geo = dict(showframe=False,projection={'type':'Mercator'}))

Pass the data and layout and plot using iplot

choromap = go.Figure(data = [data],layout = layout)
iplot(choromap,validate=False)

The output will be be like below:

Check github for full code.

In next post I will try to make a choropleth for a different data set.

 References: 

https://www.udemy.com/python-for-data-science-and-machine-learning-bootcamp

      https://plot.ly/python/choropleth-maps/

Developing data products course project

I have made a small project which demonstrate Water Quality of River Ganga (India) in various places on-route (Year 2012) as a part of JHU Coursera Data Science specialization.

This project have two parts:

  1. Created a Shiny Application

I have created a Shiny Application to demonstrate Water Quality of River Ganga (India) in various places on-route (Year 2012)

https://ppant.shinyapps.io/Course-Project-Data-Products-Shiny-Application/

  1. Created an R presentation to pitch the idea with key features of the project, source code and other links

http://rpubs.com/ppant/DevDataProductsPres

References: Data set is given from https://data.gov.in (Open Government Data Platform India)

I will do more improvement in future to give more precise results and better visualization.

Check code at Github.

 

 

Interesting Machine learning algorithms in R

Widely used Machine learning algorithms in R

  • Linear discriminant analysis (LDA) — MASS package of R can be used
  • Regression (Linear & Logistic)
  • Naive Bayes
  • Support vector machines (SVM)
  • Classification and regression trees
  • Random forests (Tree based modelling) — There is excellent package randomForest in R
  • K-Means clustering — Kmeans package of R can be used
  • Boosting

One must check caret package of R it has plenty of function to perform many MI tasks like classification, training etc.  Finally,  CRAN is the place one should visit for R packages.

JHU Data Science Specialization Capstone

I have created a text prediction application as a part of Coursera Johns Hopkins University Capstone project.

Check below for resources.

Next Word Text Prediction Algorithm — Data Science Capstone Project by JHU and Swiftkey

Presentation:

http://rpubs.com/ppant/capstone-presentation

Application:

https://ppant.shinyapps.io/nextWordPredict/

Code:

https://github.com/ppant/Coursera-Data-Science-Capstone-Project

 

Request to use and provide your valuable suggestions for improvement.

Thanks

Stanford Machine learning class slides

Andrew NG Machine learning class is the best class so far which I took online.

Apart from the course video sometimes lecture slides are also important for quick reference. For quite some time, I was looking for them as they are not available on course home.

Here all the lecture slides available at:
https://d396qusza40orc.cloudfront.net/ml/docs/slides/Lecture1.pdf

Lecture2.pdf

Lecture3.pdf

Lecture4.pdf

and so on…

 

My own experience slides only make sense if you go through the full video course.  Professor is an amazing teacher.

 

Enjoy learning.

 

Getting and cleaning data using R programming project notes

Brief notes of my learning from course project of getting and cleaning data course from John Hopkins University.

The purpose of this project is to demonstrate the ability to collect, work with, and clean a data set. Final goal here is to prepare tidy data that can be used for later analysis.

One of the most exciting areas in all of the data science right now is wearable computing – see for example companies like Fitbit, Nike, tomtom, Garmin etc are racing to develop the most advanced algorithms to attract new users. In this case study, the data is collected from the accelerometers from the Samsung Galaxy S smartphone. A full description is available at the site where the data was obtained:

http://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones

Here is the dataset for the project:

https://d396qusza40orc.cloudfront.net/getdata%2Fprojectfiles%2FUCI%20HAR%20Dataset.zip

I have created an R script called run_analysis.R which does the following.

  • Merges the training and the test sets to create one data set.
  • Extracts only the measurements on the mean and standard deviation for each measurement.
  • Uses descriptive activity names to name the activities in the data set.
  • Appropriately labels the data set with descriptive variable names.
  • Finally, creates a second, independent tidy data set with the average of each variable for each activity and each subject.References:

http://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones
https://www.coursera.org/learn/data-cleaning
https://github.com/ppant/getting-and-cleaning-data-project-coursera

 

For working code and tidy dataset please check my Github repo.