Shrikar Archak

It does not matter how slow you go so long as you do not stop. ~Confucius

How to Build a Trivia App Using Swift Programming Language

What we will cover

  • Creating a new project.
  • Grand Central Dispatch
  • How to fetch data from the web using Swift.
  • Tweet Functionality
  • Run in Simulator

Requirements: Xcode 6

Create a new project

  • Open Xcode
  • File > New > Project > Single View Application
  • Product Name: TriviaApp
  • Language: Swift
  • Device: iPhone
  • Next and save the project


Grand Central Dispatch

GCD (not greatest common divisor) is a technology built by Apple to efficiently use the multi-core processors on iOS and OS X and improve app performance.

Apps tend to become unresponsive when we perform long-running tasks, like fetching data from a web server, on the main thread. Ideally the main thread should only handle touch events and react to user input in real time.

GCD helps by pushing slow-running tasks onto a background queue to be executed concurrently without blocking the main thread.

GCD provides an abstraction over the thread pool interface and lets you write concurrent code without worrying much about the underlying concurrency model.
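The core pattern here — offload slow work to a background queue, then marshal the result back to a single "main" queue for UI updates — can be sketched outside of GCD. The Python sketch below is purely illustrative: the queues, `slow_fetch` and `update_ui` are made-up names, and real iOS code uses dispatch_async as shown later in this post.

```python
from concurrent.futures import ThreadPoolExecutor
import queue

# Analogue of a GCD global (background) queue
background = ThreadPoolExecutor(max_workers=2)
# Analogue of dispatch_get_main_queue(): work the "main thread" will drain
main_queue = queue.Queue()

def slow_fetch():
    # Stand-in for a slow network call that must not block the main thread
    return "a random fact"

def update_ui(text):
    # Stand-in for updating a UI label; must run on the "main" queue
    return "label set to: " + text

# Offload the slow work, then enqueue the UI update for the main queue.
future = background.submit(slow_fetch)
main_queue.put(lambda: update_ui(future.result()))

# The "main thread" drains its queue; the fetch already ran in the background.
result = main_queue.get()()
print(result)  # label set to: a random fact
```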

How to fetch data from the web using Swift

There are different ways of fetching data from the web, but in our case we will use NSURLSession.sharedSession() to obtain a singleton session object.

A singleton class returns the same instance no matter how many times an application requests it. A typical class permits callers to create as many instances of the class as they want, whereas with a singleton class, there can be only one instance of the class per process. A singleton object provides a global point of access to the resources of its class. Singletons are used in situations where this single point of control is desirable, such as with classes that offer some general service or resource.
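As an illustration of the pattern described above (not Apple's implementation — the class name here is made up), a minimal singleton might look like this:

```python
class SharedSession(object):
    """Toy analogue of NSURLSession.sharedSession(): one instance per process."""
    _instance = None

    @classmethod
    def shared(cls):
        # Lazily create the single instance on first access, then reuse it
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

a = SharedSession.shared()
b = SharedSession.shared()
print(a is b)  # True: every caller gets the same object
```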

Let's implement the randomFact() function, which will return a random fact each time we call it.

randomFact
func randomFact() {
    // session is the shared NSURLSession, e.g. let session = NSURLSession.sharedSession()
    let baseURL = NSURL(string: "http://numbersapi.com/random/trivia")
    let downloadTask = session.downloadTaskWithURL(baseURL, completionHandler: { (location, response, error) -> Void in
        if (error == nil) {
            let objectData = NSData(contentsOfURL: location)
            let tmpData: NSString = NSString(data: objectData, encoding: NSUTF8StringEncoding)

            dispatch_async(dispatch_get_main_queue(), { () -> Void in
                self.factText.text = tmpData
                // Tweets are limited in length, so hide the tweet button for long facts
                if (tmpData.length > 130) {
                    self.tweetButtonLabel.hidden = true
                } else {
                    self.tweetButtonLabel.hidden = false
                }
                self.activityIndicator.stopAnimating()
                self.activityIndicator.hidden = true
            })
        } else {
            // All UI work, including presenting the alert, must run on the main queue
            dispatch_async(dispatch_get_main_queue(), { () -> Void in
                let alertViewController = UIAlertController(title: "Error", message: "Couldn't connect to network", preferredStyle: .Alert)
                let okButton = UIAlertAction(title: "OK", style: .Default, handler: nil)
                let cancelButton = UIAlertAction(title: "Cancel", style: .Cancel, handler: nil)
                alertViewController.addAction(okButton)
                alertViewController.addAction(cancelButton)
                self.presentViewController(alertViewController, animated: true, completion: nil)
                self.activityIndicator.stopAnimating()
                self.activityIndicator.hidden = true
            })
        }
    })
    downloadTask.resume()
}

These tasks are executed in the background:

  • First we create a baseURL which points to the API that returns a random fact.
  • Using the shared session object we create a downloadTask to fetch the data, passing a completionHandler that will be called when the data is ready to be used.
  • The completionHandler has the following parameters: location (where the data is stored locally), response (the response from the web call), and error (in case there are any errors).
  • First we check whether there is an error; if not, we continue using the data by wrapping it in NSData and eventually NSString.

These tasks occur on the main thread:

  • Get a reference to the main queue and pass it to dispatch_async.
  • Execute the code that updates the UIView on the main thread.

Tweet Functionality

The tweet functionality is provided by the Social framework introduced in iOS. However, to use this feature we need to add Social.framework to our app. Click the target in the left sidebar of Xcode.


The Social framework (Social.framework) provides a simple interface for accessing the user’s social media accounts. This framework supplants the Twitter framework and adds support for other social accounts, including Facebook, Sina Weibo, and others. Apps can use this framework to post status updates and images to a user’s account. This framework works with the Accounts framework to provide a single sign-on model for the user and to ensure that access to the user’s account is approved.

tweet functionality
func tweetButton() {
    if SLComposeViewController.isAvailableForServiceType(SLServiceTypeTwitter) {
        let tweetSheet = SLComposeViewController(forServiceType: SLServiceTypeTwitter)
        tweetSheet.setInitialText(factText.text + " #TriviaApp")
        self.presentViewController(tweetSheet, animated: true, completion: nil)
    } else {
        let alertViewController = UIAlertController(title: "Oops", message: "No Twitter account connected on the device. Go to Settings > Twitter and add a Twitter account", preferredStyle: .Alert)
        let okButton = UIAlertAction(title: "OK", style: .Default, handler: nil)
        let cancelButton = UIAlertAction(title: "Cancel", style: .Cancel, handler: nil)
        alertViewController.addAction(okButton)
        alertViewController.addAction(cancelButton)
        self.presentViewController(alertViewController, animated: true, completion: nil)
    }
}

Clone and run the project

You can find the code on GitHub. Clone the project and run it: https://github.com/sarchak/TriviaApp

I used Treehouse to get started with iOS 8 and Swift. I strongly recommend trying it out if you want to get started with iOS 8 and Swift. Here are the links: Non Affiliate link and Affiliate link

Swift Programming Language

Swift is the new programming language from Apple for developing iOS apps. I must admit it's a lot easier for a newbie to get started with Swift than with Objective-C. Here are a few things which I found interesting.

Named Parameters

Here are two ways you can implement an area function.

area
func area(height: Int, width: Int) -> Int {
  return height * width
}

let calculatedArea = area(10, 20)
println("Area : \(calculatedArea)")
Area : 200

Let's implement the same area function using named parameters.

area
func areaNamedParameter(#height: Int, #width: Int) -> Int {
    return height * width
}
let calculatedAreaParametered = areaNamedParameter(height: 50, width: 20)
println("Area : \(calculatedAreaParametered)")
Area : 1000

Another example: returning tuples and named tuples.

A tuple type is a comma-separated list of zero or more types, enclosed in parentheses. Tuples can be used to return multiple values from a function, and we can also name the members of the returned tuple so that we can access each value by name. Let's write a simple function to convert a dictionary to a named tuple.

hashToTuple
func hashToTuple(myhash: [String:String]) -> (title:String?, name:String?) {
    return (title: myhash["title"],name:myhash["name"])
}

let data = hashToTuple(["title":"Software Engineer","name":"Shrikar"])

if let title = data.title {
    println(title)
}

if let name = data.name {
    println(name)
}

Software Engineer
Shrikar

let dataPartial = hashToTuple(["title":"Software Engineer"])
if let title = dataPartial.title {
    println(title)
}

if let name = dataPartial.name {
    println(name)
}
Software Engineer

String Optionals

Optionals are one example of Swift being a type-safe language. Swift helps you be clear about the types of values your code can work with. If part of your code expects a String, type safety prevents you from passing it an Int by mistake. This enables you to catch and fix errors as early as possible in the development process.

Some places optionals are useful:

  • When a property can be there or not there, like middleName or spouse in a Person class
  • When a method can return a value or nothing, like searching for a match in an array
  • When a method can return either a result or get an error and return nothing
  • Delegate properties (which don’t always have to be set)
  • For weak properties in classes. The thing they point to can be set to nil
  • For a large resource that might have to be released to reclaim memory

Let's take an example.

returnOptional
func returnOptional(parameter: String) -> String? {
    if parameter == "IOS8"{
        return "Awesome"
    }
    else {
        return nil
    }
}

The above function will return “Awesome” if we pass “IOS8” and nil otherwise.

returnOptional usage
let retdata = returnOptional("IOS8")
if let newdata = returnOptional("IOS8") {
    println(newdata)
} else{
    println("Not cool")
}
Awesome

Optional Chaining

Optional chaining is a process for querying and calling properties, methods, and subscripts on an optional that might currently be nil. If the optional contains a value, the property, method, or subscript call succeeds; if the optional is nil, the property, method, or subscript call returns nil. Multiple queries can be chained together, and the entire chain fails gracefully if any link in the chain is nil.

This implementation looks a lot cleaner: the returned value is converted to lowercase only when the optional actually contains a value.

optional chaining
let retdata = returnOptional("IOS8")
if let newdata = returnOptional("IOS8")?.lowercaseString {
    println(newdata)
} else{
    println("Not cool")
}

awesome

optional chaining
if let newdata = returnOptional("Some Random OS")?.lowercaseString {
    println(newdata)
} else{
  println("Not cool")
}

Not cool

This is by no means a complete list of interesting features, so please feel free to comment and I will add your suggestions to the list above.

I used Treehouse to learn Swift. I strongly recommend trying it out if you want to get started with iOS 8 and Swift. Here are the links: Non Affiliate link and Affiliate link

Skills Required for You to Succeed at a Startup

AngelList is a platform where people can invest online in new startups. It is also a gold mine for startup jobs, ranging from developer to designer on the technical side and from sales rep to VP on the business side. I have seen many posts where people ask about the skills required to succeed at a startup. In this post let's identify these skills and roles using the AngelList APIs.

Let's get started:

  • Fetch the jobs data from the AngelList API here.
  • Store the data in MongoDB.
  • Run a mapreduce or aggregation job to get the top skills and roles.


Key notes

  • Clearly popular startups are using different frameworks like Rails, Node.js and Python (which I assume might be Django related).
  • iOS wins over Android when it comes to mobile apps.
  • MySQL, MongoDB and Redis are the key databases.

Script for fetching the jobs

angel_jobs.py
import requests
from pymongo import Connection

con = Connection("localhost", 27017)
angeldb = con.angeldb
jobs = angeldb.jobs

page = 1
while True:
    r = requests.get("https://api.angel.co/1/jobs?page=" + str(page))
    data = r.json()
    for job in data["jobs"]:
        jobs.save(job)
    last_page = int(data["last_page"])
    print "Current page : " + str(page) + " Last page : " + str(last_page)
    # Stop once the last page has been fetched
    if page >= last_page:
        break
    page = page + 1

Mapreduce job for finding the top skills

mapreduce.js
var mapFunction = function() {
  for(var idx = 0 ; idx < this.tags.length; idx++){
      if(this.tags[idx].tag_type == "SkillTag"){
          emit(this.tags[idx].name, 1)
      }
  }
};

var reduceFunction = function(tag, valuesPrices) {
  return Array.sum(valuesPrices);
};

db.jobs.mapReduce(
  mapFunction,
  reduceFunction,
  { out: "top_skill" }
)

Mapreduce job for finding the top roles

mapreduce.js
var mapFunction = function() {
  for(var idx = 0 ; idx < this.tags.length; idx++){
      if(this.tags[idx].tag_type == "RoleTag"){
          emit(this.tags[idx].name, 1)
      }
  }
};

var reduceFunction = function(tag, valuesPrices) {
  return Array.sum(valuesPrices);
};

db.jobs.mapReduce(
  mapFunction,
  reduceFunction,
  { out: "top_roles" }
)

Once we run the above mapreduce scripts in the MongoDB shell, we will have two collections, top_roles and top_skill; sorting them by the count value will give us what we want.
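The same tag counting can be sketched in plain Python over a couple of made-up job documents, mirroring what the map and reduce functions above do:

```python
from collections import Counter

# Made-up sample of AngelList job documents (same shape the API returns)
jobs = [
    {"tags": [{"tag_type": "SkillTag", "name": "ruby"},
              {"tag_type": "RoleTag", "name": "developer"}]},
    {"tags": [{"tag_type": "SkillTag", "name": "ruby"},
              {"tag_type": "SkillTag", "name": "ios"}]},
]

# "map" emits (name, 1) per SkillTag; Counter performs the "reduce" (summing)
top_skill = Counter(tag["name"]
                    for job in jobs
                    for tag in job["tags"]
                    if tag["tag_type"] == "SkillTag")

print(top_skill.most_common())  # [('ruby', 2), ('ios', 1)]
```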

While analysing these startups and jobs I found many "X for Y" startups, and I will provide detailed information about that in the next post, so stay tuned. Also let me know if you want to see anything else around this data.

Here is a sample

  • Uber & AirBnB of Food!
  • Uber for Bears
  • Uber for CraigsList
  • Uber for Healthcare; doctors come to you, anytime, anywhere.
  • Uber for Laundry
  • Uber for Lawn Mowing connecting the 74B US Lawn Care Market
  • Uber for Massage
  • Uber for Out of Home Advertising
  • Uber for Pizzas
  • Uber for Shipping
  • Uber for Tech Support
  • Uber for Tennis
  • Uber for Urban Logistics
  • Uber for auto rickshaws
  • Uber for boats!
  • Uber for career planning & development
  • Uber for dog walking
  • Uber for everything with the power to choose
  • Uber for food & drinks at bars, restaurants, and coffee shops
  • Uber for food and drink!
  • Uber for food delivery
  • Uber for hotels
  • Uber for lines
  • Uber for moving goods
  • Uber for trucking


Trending Topics on Twitter

Twitter is an important platform when it comes to finding interesting topics in real time. One of the interesting projects we can build is figuring out these topics as they emerge. Let's find out which topics are connected to a given topic; an example is finding out what people are talking about in the big data community. This particular problem statement can be applied to many other general problems.

Building the components.

  • The first component fetches the data from Twitter. Twitter provides a streaming API for fetching data on a given topic, which in our case is big data. More info here; make sure you have track=["bigdata","big data"].
  • Count the high-frequency words in the collected data. Take a look at NLTK: it provides tokenization and a frequency distribution mechanism.
  • NLTK also provides a mechanism for ignoring stopwords like a, if, the, was, etc.
  • The top keywords in the frequency distribution are the most used words and hence suggest importance.
  • To help you get started, here is a Python script for fetching the data. Make sure you install tweepy: sudo easy_install tweepy
  • Create an app on Twitter and insert the necessary consumer_key, consumer_secret, access_token and access_secret.
fetch.py
from tweepy.streaming import StreamListener
from tweepy import OAuthHandler
from tweepy import Stream

# Go to http://dev.twitter.com and create an app.
# The consumer key and secret will be generated for you after.
consumer_key = "Insert consumer key here"
consumer_secret = "Insert consumer secret here"

access_token = "Insert access token here"
access_token_secret = "Insert access token secret here"

class StdOutListener(StreamListener):
    """A listener handles tweets that are received from the stream.
    This is a basic listener that just prints received tweets to stdout.
    """
    def on_data(self, data):
        print data
        return True

    def on_error(self, status):
        # Print the error code and disconnect the stream
        print status
        return False

if __name__ == '__main__':
    l = StdOutListener()
    auth = OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)

    stream = Stream(auth, l)
    stream.filter(track=["big data", "bigdata"])
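The counting step from the bullet list above can be sketched with collections.Counter as a lightweight stand-in for NLTK's FreqDist; the tweets and the tiny stopword list below are made up for illustration (NLTK's stopwords corpus is much larger):

```python
from collections import Counter
import re

# A few collected tweets (made-up sample data)
tweets = [
    "big data is transforming analytics",
    "hadoop and spark dominate the big data stack",
    "the future of analytics is big data",
]

# Tiny illustrative stopword list
stopwords = {"is", "the", "and", "of"}

# Tokenize each tweet and drop stopwords
words = []
for tweet in tweets:
    words.extend(w for w in re.findall(r"[a-z]+", tweet.lower())
                 if w not in stopwords)

freq = Counter(words)
print(freq.most_common(3))  # "big" and "data" lead with 3 occurrences each
```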

Voice of Internet: Twilio, ParseApp and Webflow Based Audition Platform

I have been playing with many cloud based platforms, and some really stand out from the rest. With these new platforms the time taken to build an app has reduced drastically, and it's a lot easier to get started as well.

A few of the platforms I used to build this app:

Voice Of Internet

Voice of Internet is a platform where people can call a number and show off their talent within 2 minutes. It can be singing, playing an instrument, or whatever you can think of. Once the content is PG rated, you can vote by sending an SMS to the phone number mentioned.

Give it a try here

Voice Of Internet

SmartCopy: Intelligent Layer on Top of Existing Cloud Storage

Simple feature matters

With so many options available in the cloud storage space, I am sure everyone uses one or more cloud storage services (Dropbox/Box/Google Drive etc). One key missing feature is a simple way to exclude files from syncing.

Storage space is not free

Storage space is not free, so it really matters what we sync to the cloud. Nitpicking individual files to save space is not an easy option, so we tend to copy files we don't need.

It was not just me facing this problem; there were similar feature requests in the Dropbox, Box and Google Drive forums. I wonder why these simple features were ignored. Anyway, enough nagging; let's get to the good part.

Deciding what language or tools to use.

Languages I know: C, C++, Java, Scala, Python. After working in C/C++ for a long time I knew managing binaries and shared libraries would be painful, so I eliminated them.

Requirements:

  • Should support monitoring directory/file changes. All three remaining languages (Java, Scala and Python) qualify.
  • Should be installed by default, or installation should be bare minimum. Python is installed by default on most operating systems and hence is a good candidate.
  • Should be a Unix based system with support for forking. (Thanks for the comment from Nei)

Python it is!!

Design

I followed an approach similar to .gitignore and decided to keep a list of all the patterns that should be excluded from syncing.

Example

  • .*.jar : Ignore all the files containing .jar
  • .class$ : Ignore all the files ending with .class
  • ^Bingo : Ignore all the files starting with Bingo

For more information on using regular expressions, please check the Python regex documentation.
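As a sketch, here is how such ignore patterns could be applied with Python's re module. The is_ignored helper is hypothetical, written for illustration; it is not SmartCopy's actual implementation.

```python
import re

# The ignore patterns from the example above
ignore_patterns = [r".*.jar", r".class$", r"^Bingo"]

def is_ignored(filename):
    # A file is skipped from syncing if any pattern matches anywhere in its name
    return any(re.search(p, filename) for p in ignore_patterns)

print(is_ignored("app.jar"))        # True: contains .jar
print(is_ignored("Main.class"))     # True: ends with .class
print(is_ignored("BingoCard.txt"))  # True: starts with Bingo
print(is_ignored("notes.txt"))      # False: no pattern matches
```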

Components

  • smartcopyd : the SmartCopy daemon monitors a directory for changes, filters the files according to the ignore patterns and syncs them to the cloud storage.

  • smartcopy : the SmartCopy client allows you to change the config file and modify any ignore pattern rules.

Possible improvements/features

If you need a feature, do tweet. The feature with the most tweets or retweets wins and will be implemented next.

Github repo : SmartCopy

Learn how to build a game like flappy bird

Docker Nginx and Sentiment Engine on Steroids

Recipe for 74 Million request per day

In this blog post I will explain a battle tested setup which lets you scale HTTP requests up to 860 req/s, or a cumulative 74 million requests per day.

Let's start with our requirements. We needed a low latency sentiment classification engine serving literally millions of social mentions per day. Of late, load against the sentiment engine cluster has been increasing considerably after Viralheat's pivot to serve enterprise customers. The existing infrastructure was not able to handle the new load, forcing us to pull a Friday night out to fix it.

Setup

  • Nginx running on bare metal
  • Sentiment engine powered by a Tornado server in Docker instances (Docker version 0.7.5)

In a perfect world the default kernel settings would work for any kind of workload, but in reality they won't. The default kernel settings are not suitable for high load; they are mainly for general purpose networking. In order to serve heavy, short lived connections we need to tune certain OS settings along with the TCP settings.

First increase the open file limit

Modify /etc/security/limits.conf to allow a high number of open file descriptors. Since every open file takes some OS resources, make sure you have sufficient memory and don't blindly increase the open file limits.

/etc/security/limits.conf
*               soft     nofile          100000
*               hard     nofile          100000

Sysctl Changes

Modify /etc/sysctl.conf to have these parameters.

/etc/sysctl.conf
fs.file-max = 100000
net.ipv4.ip_local_port_range = 2000 65000
net.ipv4.tcp_fin_timeout = 5
net.ipv4.tcp_keepalive_time = 1800
net.ipv4.tcp_window_scaling = 0
net.ipv4.tcp_sack = 0
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_max_syn_backlog = 3240000
net.ipv4.tcp_max_tw_buckets = 1440000
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_congestion_control = cubic

net.core.rmem_default = 8388608
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
  • net.ipv4.ip_local_port_range Nginx needs to create two connections for every request: one to the client and one to the upstream server. Increasing the port range helps prevent port exhaustion.
  • net.ipv4.tcp_fin_timeout The minimum number of seconds that must elapse before a connection in TIME_WAIT state can be recycled. Lowering this value means allocations are recycled faster.
  • net.ipv4.tcp_tw_recycle Enables fast recycling of TIME_WAIT sockets. Use with caution and ONLY in internal networks where network connectivity speeds are "faster".
  • net.ipv4.tcp_tw_reuse Allows reusing sockets in TIME_WAIT state for new connections when it is safe from the protocol viewpoint. Default value is 0 (disabled). It is generally a safer alternative to tcp_tw_recycle. Note: tcp_tw_reuse is particularly useful in environments where numerous short connections are opened and left in TIME_WAIT state, such as web servers. Reusing the sockets can be very effective in reducing server load.

Make sure you run sudo sysctl -p after making modifications to the sysctl.conf.

NGINX Configurations

nginx.conf
worker_processes  auto;
worker_rlimit_nofile 96000;

events {
  use epoll;
  worker_connections  10000;
  multi_accept on;
}

http {
  sendfile    on;
  tcp_nopush  on;
  tcp_nodelay on;

  reset_timedout_connection on;

  # upstream and server blocks must live inside the http block
  upstream sentiment_server {
    server server0:9000;
    server server1:9001;
    server server2:9002;
    server server3:9003;
    server server4:9004;
    server server5:9005;
    server server6:9006;
    server server7:9007;
    server server8:9008;
    server server9:9009;
    server server10:9010;
    server server11:9011;
    keepalive 512;
  }

  server {
    server_name serverip;
    location / {
      # must match the upstream name declared above
      proxy_pass http://sentiment_server;
      proxy_set_header   Connection "";
      proxy_http_version 1.1;
      break;
    }
  }
}
  • worker_processes defines the number of worker processes that nginx should use when serving your website. The optimal value depends on many factors including (but not limited to) the number of CPU cores, the number of hard drives that store data, and load pattern. When in doubt, setting it to the number of available CPU cores would be a good start (the value “auto” will try to autodetect it).
  • worker_rlimit_nofile changes the limit on the maximum number of open files for worker processes. If this isn't set, your OS will limit it. Chances are your OS and nginx can handle more than ulimit -n reports, so we'll set this high so nginx never runs into "too many open files".
  • worker_connections sets the maximum number of simultaneous connections that can be opened by a worker process. Since we bumped up worker_rlimit_nofile, we can safely set this pretty high.
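As a rough back-of-the-envelope check: a commonly cited ceiling on concurrent clients is worker_processes × worker_connections, halved for a reverse proxy since each proxied request consumes both a client-side and an upstream-side connection. The 8 worker processes below are an assumption for an 8-core box ("auto" would pick the core count).

```python
# Values mirror the config above; worker_processes is an assumed core count.
worker_processes = 8
worker_connections = 10000

# Total connections all workers can hold open at once
max_connections = worker_processes * worker_connections
# Halved when reverse proxying: two connections per in-flight request
max_proxied_clients = max_connections // 2

print(max_connections)      # 80000
print(max_proxied_clients)  # 40000
```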


Docker for sentiment engine.

Our sentiment engine runs inside a Docker container, which helps us iterate and deploy new models fast. Our initial assumption was that running inside Docker would carry a performance overhead, but it didn't. We tuned the container with configurations similar to the base machine; the sysctl.conf inside the container was almost identical to the host machine's.

A good addition to the backend infrastructure would be some kind of intelligent controller which can look at the load and scale the sentiment engine instances up or down. This can be done easily, as Docker exposes a REST API to create and destroy containers on the fly. If you are interested in the work we do, check our careers page: Viralheat Careers

FYI: please do not copy-paste these settings and assume they will work automatically. There are many variables, like the server machine's memory, CPU, etc. This guide should be used to help you with tuning.

Daily Commute and Coursera Course Completion Relationship - My View

First part of my Story:

I commute daily from Santa Clara to San Mateo and have been doing this for almost 15 months. Anyone who travels on Freeway 101 will agree with me that the traffic sucks. There is no predictable way of finding out when 101 will be free; I tried starting at different times but still couldn't find one that works. If I am really, really lucky I reach the office in 35 minutes, but 90% of the time the commute is anywhere between 45 minutes and 1 hour 30 minutes one way. I would say the average travel time is 1 hour each way. This travel comes with an additional member who joins the party: STRESS. It's quite common to see a few accidents on 101 daily. I recently had a terrible accident where a guy hit my car from behind. I believe these accidents are mainly caused by using mobile phones, but I can't even blame the drivers, since the travel time itself is so bad that they need something to keep them occupied.

Second part of my Story:

I like to keep up with current trends in technology and have been taking Coursera courses from day one. Initially my office was near my house and I was able to complete the courses after going back home. After joining the new company I noticed that my course completion rate had gone down significantly. I tried completing Functional Programming in Scala last time but couldn't. By the time I got home I was so tired that my enthusiasm for learning new stuff had decreased considerably. My laptop usage was restricted to checking mail, monitoring and fixing issues with the production systems, and exploring new stuff related to work.

Third part of my Story:

I always wanted to travel by public transport to avoid this traffic, but one constraint prevented me from doing that: the connecting links between VTA, Caltrain and the shuttles. If I didn't want to waste time waiting between the connecting links, I had to leave home between 7:20 and 7:30am. Due to my recent accident my car is currently in the body shop for repair, so I figured this was the right time to try public transport. The travel time itself has not reduced, but I can now utilize that time since I am not driving anymore. I can listen to music, browse, watch videos or program. I started watching Coursera videos and completing the assignments during this time. The results have been awesome so far: I am in my last week of Functional Programming in Scala and hopefully will complete it this time.

If you are taking Coursera classes and have not been completing the courses, see if your pattern matches mine :).

PlayFramework SecureSocial and MongoDB

In this blog post we will cover how to integrate SecureSocial into a Play Framework application. SecureSocial is an authentication module for Play Framework applications supporting OAuth, OAuth2, OpenID, username/password and custom authentication schemes. SecureSocial has an example where the tokens and users are all stored in memory, but to make it a bit more interesting we will store all the users in MongoDB using the Play ReactiveMongo plugin.

Let's get started.

Installation

Add the required dependencies to project/Build.scala.

project/Build.scala
object ApplicationBuild extends Build {

  val appName         = "crowdsource"
  val appVersion      = "1.0-SNAPSHOT"

  val appDependencies = Seq(
    // Add your project dependencies here,
    jdbc,
    anorm,
    "org.reactivemongo" %% "play2-reactivemongo" % "0.9",
    "securesocial" %% "securesocial" % "2.1.1"
  )


  val main = play.Project(appName, appVersion, appDependencies).settings(
    resolvers += Resolver.url("sbt-plugin-snapshots", url("http://repo.scala-sbt.org/scalasbt/sbt-plugin-releases/"))(Resolver.ivyStylePatterns)
  )

}  

ReactiveMongo Configuration

Create a file in conf/play.plugins

conf/play.plugins
400:play.modules.reactivemongo.ReactiveMongoPlugin

MongoDB configuration in conf/application.conf

conf/application.conf
mongodb.servers = ["localhost:27017"]
mongodb.db = "crowdsource"

Secure Social Configuration

Modifying routes

SecureSocial relies on these routes being available in the application.

conf/routes
# Login page
GET     /login                      securesocial.controllers.LoginPage.login
GET     /logout                     securesocial.controllers.LoginPage.logout

# User Registration and password handling (only needed if you are using UsernamePasswordProvider)
GET     /signup                     securesocial.controllers.Registration.startSignUp
POST    /signup                     securesocial.controllers.Registration.handleStartSignUp
GET     /signup/:token              securesocial.controllers.Registration.signUp(token)
POST    /signup/:token              securesocial.controllers.Registration.handleSignUp(token)
GET     /reset                      securesocial.controllers.Registration.startResetPassword
POST    /reset                      securesocial.controllers.Registration.handleStartResetPassword
GET     /reset/:token               securesocial.controllers.Registration.resetPassword(token)
POST    /reset/:token               securesocial.controllers.Registration.handleResetPassword(token)
GET     /password                   securesocial.controllers.PasswordChange.page
POST    /password                   securesocial.controllers.PasswordChange.handlePasswordChange


# Providers entry points
GET     /authenticate/:provider     securesocial.controllers.ProviderController.authenticate(provider)
POST    /authenticate/:provider     securesocial.controllers.ProviderController.authenticateByPost(provider)
GET     /not-authorized             securesocial.controllers.ProviderController.notAuthorized   

Append the following to conf/play.plugins.

In this application we use the username and password based authentication provided by SecureSocial, so we need to make sure the relevant plugins are properly configured.

conf/play.plugins
400:play.modules.reactivemongo.ReactiveMongoPlugin
1500:com.typesafe.plugin.CommonsMailerPlugin
9994:securesocial.core.DefaultAuthenticatorStore
9995:securesocial.core.DefaultIdGenerator
9996:securesocial.core.providers.utils.DefaultPasswordValidator
9997:controllers.plugin.MyViews
9998:service.MongoUserService
9999:securesocial.core.providers.utils.BCryptPasswordHasher
10004:securesocial.core.providers.UsernamePasswordProvider 

For SecureSocial to work we need to implement its UserService; in our case that is the service.MongoUserService entry at priority 9998. This is the component that stores user data and tokens in MongoDB and retrieves them when required.

MongoUserService
package service

import _root_.java.util.Date
import securesocial.core._
import play.api.{Logger,Application}
import securesocial.core.providers.Token
import play.api.libs.json._
import play.api.libs.json.Reads._
import play.api.libs.json.Writes._
import play.modules.reactivemongo.MongoController
import play.api.mvc.Controller
import play.modules.reactivemongo.json.collection.JSONCollection
import scala.concurrent.Await
import scala.concurrent.duration._
import org.joda.time.DateTime
import org.joda.time.format.{DateTimeFormatter, DateTimeFormat}

class MongoUserService(application: Application) extends UserServicePlugin(application) with Controller with MongoController{
  def collection: JSONCollection = db.collection[JSONCollection]("users")
  def tokens: JSONCollection = db.collection[JSONCollection]("tokens")
  val outPutUser = (__ \ "id").json.prune

  def retIdentity(json : JsObject) : Identity = {
    val userid = (json \ "userid").as[String]

    val provider = (json \ "provider").as[String]
    val firstname = (json \ "firstname").as[String]
    val lastname = (json \ "lastname").as[String]
    val email = (json \ "email").as[String]
    val avatar = (json \ "avatar").as[String]
    val hash = (json \ "password" \ "hasher").as[String]
    val password = ( json \ "password" \ "password").as[String]
    val salt = (json \ "password" \ "salt").asOpt[String]
    val authmethod = ( json \ "authmethod").as[String]

    val identity : IdentityId = new IdentityId(userid,authmethod)
    val authMethod : AuthenticationMethod = new AuthenticationMethod(authmethod)
    val pwdInfo: PasswordInfo = new PasswordInfo(hash,password)
    val user: SocialUser = new SocialUser(identity, firstname, lastname, firstname + " " + lastname, Some(email), Some(avatar), authMethod, None, None, Some(pwdInfo))
    user
  }

  def findByEmailAndProvider(email: String, providerId: String): Option[Identity] = {
    // Look up the first matching user document and convert it to an Identity.
    val cursor = collection.find(Json.obj("userid" -> email, "provider" -> providerId)).cursor[JsObject]
    Await.result(cursor.headOption, 5 seconds).map(retIdentity)
  }

  def save(user: Identity): Identity = {

    val email = user.email match {
      case Some(email) => email
      case _ => "N/A"
    }

    val avatar = user.avatarUrl match{
      case Some(url) => url
      case _ => "N/A"
    }

    val savejson = Json.obj(
      "userid" -> user.identityId.userId,
      "provider" -> user.identityId.providerId,
      "firstname" -> user.firstName,
      "lastname" -> user.lastName,
      "email" -> email,
      "avatar" -> avatar,
      "authmethod" -> user.authMethod.method,
      "password" -> Json.obj("hasher" -> user.passwordInfo.get.hasher, "password" -> user.passwordInfo.get.password, "salt" -> user.passwordInfo.get.salt),
      "created_at" -> Json.obj("$date" -> new Date()),
      "updated_at" -> Json.obj("$date" -> new Date())
    )
    collection.insert(savejson)
    user
  }

  def find(id: IdentityId): Option[Identity] = {
   findByEmailAndProvider(id.userId,id.providerId)
  }

  def save(token: Token) {
    val tokentosave = Json.obj(
      "uuid" -> token.uuid,
      "email" -> token.email,
      "creation_time" -> Json.obj("$date" -> token.creationTime),
      "expiration_time" -> Json.obj("$date" -> token.expirationTime),
      "isSignUp" -> token.isSignUp
    )
    tokens.save(tokentosave)
  }



  def findToken(token: String): Option[Token] = {
    // Fetch the token document by uuid and rebuild the SecureSocial Token from it.
    val cursor = tokens.find(Json.obj("uuid" -> token)).cursor[JsObject]
    Await.result(cursor.headOption, 5 seconds).map { obj =>
      val uuid = (obj \ "uuid").as[String]
      val email = (obj \ "email").as[String]
      val created = (obj \ "creation_time" \ "$date").as[Long]
      val expire = (obj \ "expiration_time" \ "$date").as[Long]
      val signup = (obj \ "isSignUp").as[Boolean]
      new Token(uuid, email, new DateTime(created), new DateTime(expire), signup)
    }
  }

  def deleteToken(uuid: String) {}

  def deleteExpiredTokens() {}
}  

The above code mixes in MongoController, which provides helpers to interact with MongoDB using JSON documents instead of BSONDocuments.

This is our simple SecureSocial Application controller, where the index action needs to be authenticated.

Application.scala
object Application extends Controller  with SecureSocial{
  def index = SecuredAction { implicit  request =>
    Ok(views.html.index(request.user))
  }
}
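
The SecuredAction above only protects actions that are actually routed. Assuming the index action serves the root path (adjust the path to your application), the corresponding conf/routes entry would look like this:

conf/routes
GET     /                           controllers.Application.index

Unauthenticated requests to this route are redirected to the /login page mapped earlier.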

All the necessary code can be found on GitHub.

Machine Learning Playground Using Docker

Recently I have seen a lot of interest from people wanting to learn machine learning, particularly machine learning in Python. Python has some awesome tools for getting started with machine learning, but getting everything installed is painful. This is one of the reasons people lose interest :).

If you are familiar with Docker, I have created an image which has all the necessary packages installed:

  • NumPy
  • SciPy
  • scikit-learn
  • Matplotlib
  • pandas

All you need to do is run docker pull shrikar/machinelearning
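
Once the image is pulled, a quick way to confirm the stack works is to run a small script inside the container (for example via docker run -it shrikar/machinelearning python, assuming Python is the image's interpreter on the path). The snippet below is a minimal sanity check that only assumes NumPy is installed:

```python
# Minimal sanity check for the scientific Python stack inside the container.
import numpy as np

# Build a small matrix and exercise a few basic operations.
a = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
print(a.sum())                   # 15
print(a.mean())                  # 2.5
print(a.T.shape)                 # (3, 2)
```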

If you don't know about Docker, head over to Docker and get started. Since Docker needs the LXC features of the Linux kernel, the easiest route is to get a simple $5 server on DigitalOcean, install Docker, and pull the image. Here is a tutorial on how to install Docker on Digital Ocean: Install docker on Digital Ocean