Machine Learning opencv Self Driving car

Self Driving Car: Lane Detection

First project for the self driving car nanodegree involves identifying lanes in images and video stream. We will be using opencv a popular image processing library in python.

Step involved in the image processing pipeline

1) Read image

First step is to read the image using matplotlib’s imread method. This method takes in the filename and converts the image into a 3 dimensional numpy array. You might be wondering why 3 dimensional array? These 3 dimensions refer to the RGB channels.

    image = mpimg.imread('test_images/solidWhiteRight.jpg')
    print('This image is:', type(image), 'with dimesions:', image.shape)

The result of the above code would be

This image is: &ltclass 'numpy.ndarray'&gt with dimesions: (540, 960, 3)

If you look at the result we can see that the size of the image is 540 height and 960 width.

Caution: Here the first col refers to the height which might be a bit confusing. In a normal grid we consider x to be the first column and y to be the second column. Another thing to note is the origin in image processing world is top left unlike bottom left in other circumstances.

If we consider [0][0] representing the 0th pixel then [0][0][0],[0][0][1] and [0][0][2] represent the Red, green and blue channel values.

2) Convert to grayscale

Converting to gray scale can be easily done using opencv. If you look at the shape of the image after gray scaling you can see that the shaped changed from (540, 960, 3) to (540, 960)

grayscaled = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
This image is: &ltclass 'numpy.ndarray'&gt with dimesions: (540, 960, 1)

3) Gausian Blur

Most edge-detection algorithms are sensitive to noise, in order to reduce the noise we will apply gausian blur to reduce to noise and smooth the image. Play around with different values for kernel density.

def gaussian_blur(img, kernel_size):
    """Applies a Gaussian Noise kernel"""
    return cv2.GaussianBlur(img, (kernel_size, kernel_size), 0)

4) Region of Interest.

If you look at the image we could possible consider only our current lane by masking out the areas which are outside the current boundary. I used a simpler approach for region of interest.

def region_of_interest(img, vertices):
    Applies an image mask.
    Only keeps the region of the image defined by the polygon
    formed from `vertices`. The rest of the image is set to black.
    #defining a blank mask to start with
    mask = np.zeros_like(img)   
    #defining a 3 channel or 1 channel color to fill the mask with depending on the input image
    if len(img.shape) > 2:
        channel_count = img.shape[2]  # i.e. 3 or 4 depending on your image
        ignore_mask_color = (255,) * channel_count
        ignore_mask_color = 255
    #filling pixels inside the polygon defined by "vertices" with the fill color    
    cv2.fillPoly(mask, vertices, ignore_mask_color)
    #returning the image only where mask pixels are nonzero
    masked_image = cv2.bitwise_and(img, mask)
    return masked_image

Here are my selection of vertices for ROI

max_width = image.shape[1]
max_height = image.shape[0]
width_delta = int(max_width/20)
vertices = np.array([[(100, max_height), (max_width -100, max_height), (max_width/2 + width_delta, max_height/2 + 50), (max_width/2 - width_delta, max_height/2 + 50)]], np.int32)

Key points to remember

  • If you run just the plain canny edge and hough transforms its quite possible that the lane marking won’t be as smooth as shown in the video. One way to fix that is possible with this approach
    Find the slope and intercept of the line using the lines detected by hough transforms
    Maintain a rolling window average of the last 20 frames for intercept, slope and also the top y position.
  • For the challenge video you might find it that it’s hard to convert identify the yellow lane boundary on the concrete road making it hard to identify lines and extrapolate. The way I got around this problem was by converting the yellow lane to white lane.
def yellow_to_white(img):
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    upper_yellow = np.array([100, 255, 255]) 
    lower_yellow = np.array([80, 100, 100]) 
    mask_inverse = cv2.inRange(hsv, lower_yellow, upper_yellow)    
    masked_replace_white = cv2.addWeighted(img, 1, \
                                       cv2.cvtColor(mask_inverse, cv2.COLOR_GRAY2RGB), 1, 0)
    return masked_replace_white


Once we find the intercept, slope and the top y position extrapolation would become very simple. I won’t provide the whole code here as that is the challenge of the first project but will certainly provide some hints

Rolling slopes and intercepts : Take a look at np.linalg.lstsq in scipy library which will help you calculate the slope and intercept.

def lines_linreg(lines_array):  
        Calculates the slope and intecept for line segments
        ### Select the 0th and 2nd index which will provide the xval and reshape to extract x values
        x = np.reshape(lines_array[:, [0, 2]], (1, len(lines_array) * 2))[0]
        ### Select the 1st and 3rd index which will provide the yval and reshape to extract 7 values
        y = np.reshape(lines_array[:, [1, 3]], (1, len(lines_array) * 2))[0]

        A = np.vstack([x, np.ones(len(x))]).T
        m, c = np.linalg.lstsq(A, y)[0]
        x = np.array(x)
        y = np.array(x * m + c).astype('int')
    return x, y, m, c

#Snippet of the extrapolate method    
    slopes = np.apply_along_axis(lambda row: (row[3] - row[1])/(row[2] - row[0]), 2, lines)
    pos_slopes = slopes > 0.50
    pos_lines = lines[pos_slopes]

    neg_slopes = slopes < -0.50
    neg_lines = lines[neg_slopes]
    if len(pos_lines) == 0 or len(neg_lines) == 0:
    pos_x, pos_y, pos_m, pos_c = lines_linreg(pos_lines)
    pos_global_slopes = np.append(pos_global_slopes, [pos_m])
    pos_global_intercepts = np.append(pos_global_intercepts, [pos_c])

    pos_m = pos_global_slopes[-20:].mean()
    pos_c = pos_global_intercepts[-20:].mean()

Similary you can find the slope and intercept for the right lines. I need to find the bottom left (x,y), top left(x,y), top right(x,y) and bottom right(x,y) to extrapolate the lines. The below code will help you find the bottom left and the top left co-ordinates. I find the min top for between positive and negative lines and then keep a rolling average for the same. This will make the lines much more smoother. If you are seeing a lot of flickering / jittery video then this trick will help you solve it.

    bottom_left_y = img.shape[0]
    bottom_left_x = int((bottom_left_y - neg_c)/neg_m)
    min_top_y = np.min([neg_y.min(), pos_y.min()])
    top_y = np.append(top_y, [min_top_y])
    min_top_y = int(top_y[-20:].mean())
    top_left_y = min_top_y
    top_left_x = int((top_left_y - neg_c)/neg_m)

Once you find the co-ordinates drawing the lines should be very simple with cv2.line method. Full code can be found hereĀ

About the author


Backend/Infrastructure Engineer by Day. iOS Developer for the rest of the time.

/* ]]> */