The first project for the Self-Driving Car Nanodegree involves identifying lane lines in images and a video stream. We will be using OpenCV, a popular image processing library, in Python.
Steps involved in the image processing pipeline
1) Read image
The first step is to read the image using matplotlib's imread method. This method takes in a filename and converts the image into a 3-dimensional numpy array. You might be wondering why 3 dimensions: the first two are the pixel grid (height and width), and the third holds the RGB channel values.
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
# cv2 and numpy are used in the later snippets
import cv2
import numpy as np

image = mpimg.imread('test_images/solidWhiteRight.jpg')
print('This image is:', type(image), 'with dimensions:', image.shape)
plt.imshow(image)
The result of the above code would be:

This image is: <class 'numpy.ndarray'> with dimensions: (540, 960, 3)
Looking at the result, we can see that the image is 540 pixels tall and 960 pixels wide.
Caution: the first value refers to the height, which can be a bit confusing. On a normal grid we consider x to be the first coordinate and y the second. Another thing to note is that in the image processing world the origin is at the top left, unlike the bottom left in most other contexts.
If we consider [0][0] to be the top-left pixel, then [0][0][0], [0][0][1], and [0][0][2] represent its red, green, and blue channel values.
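As a quick illustration, you can inspect those channel values directly (using the image loaded above):

r, g, b = image[0][0]  # channel values of the top-left pixel
print('R:', r, 'G:', g, 'B:', b)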
2) Convert to grayscale
Converting to grayscale is easily done using OpenCV. If you look at the shape of the image after grayscaling, you can see that it changed from (540, 960, 3) to (540, 960).
# matplotlib loads images as RGB, so RGB2GRAY is the correct conversion here
grayscaled = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
print('This image is:', type(grayscaled), 'with dimensions:', grayscaled.shape)

This image is: <class 'numpy.ndarray'> with dimensions: (540, 960)
3) Gaussian Blur
Most edge-detection algorithms are sensitive to noise, so we apply a Gaussian blur to reduce the noise and smooth the image. Play around with different values for the kernel size.
def gaussian_blur(img, kernel_size):
    """Applies a Gaussian Noise kernel"""
    return cv2.GaussianBlur(img, (kernel_size, kernel_size), 0)
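For example, applying a 5×5 kernel (the kernel size of 5 is an assumption; experiment with it):

blurred = gaussian_blur(grayscaled, 5)  # kernel size 5 is an assumed starting point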
4) Region of Interest
If you look at the image, we can consider only our current lane by masking out the areas outside its boundary. I used a simple polygonal region of interest.
def region_of_interest(img, vertices):
    """
    Applies an image mask.

    Only keeps the region of the image defined by the polygon
    formed from `vertices`. The rest of the image is set to black.
    """
    # defining a blank mask to start with
    mask = np.zeros_like(img)

    # defining a 3 channel or 1 channel color to fill the mask with depending on the input image
    if len(img.shape) > 2:
        channel_count = img.shape[2]  # i.e. 3 or 4 depending on your image
        ignore_mask_color = (255,) * channel_count
    else:
        ignore_mask_color = 255

    # filling pixels inside the polygon defined by "vertices" with the fill color
    cv2.fillPoly(mask, vertices, ignore_mask_color)

    # returning the image only where mask pixels are nonzero
    masked_image = cv2.bitwise_and(img, mask)
    return masked_image
Here is my selection of vertices for the ROI:
max_width = image.shape[1]
max_height = image.shape[0]
width_delta = int(max_width / 20)
vertices = np.array([[(100, max_height),
                      (max_width - 100, max_height),
                      (max_width / 2 + width_delta, max_height / 2 + 50),
                      (max_width / 2 - width_delta, max_height / 2 + 50)]], np.int32)
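With these vertices the mask can then be applied, for instance to the blurred image from above (or to the Canny edge output, depending on where you place this step in your pipeline):

masked = region_of_interest(blurred, vertices)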
Key points to remember
- If you run just the plain Canny edge detection and Hough transform, it's quite possible that the lane markings won't be as smooth as shown in the video. One way to fix that is with the following approach (a sketch of the Canny/Hough step itself appears after this list):
Find the slope and intercept of the lane using the line segments detected by the Hough transform.
Maintain a rolling-window average over the last 20 frames for the intercept, the slope, and the top y position.
- For the challenge video you might find it hard to identify the yellow lane boundary on the concrete road, which makes it hard to detect lines and extrapolate. The way I got around this problem was by converting the yellow lane markings to white.
def yellow_to_white(img):
    # Note: the frames are RGB, but using the BGR2HSV conversion means yellow
    # pixels land around hue 90 (where cyan normally sits), which is exactly
    # the band the bounds below target.
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    upper_yellow = np.array([100, 255, 255])
    lower_yellow = np.array([80, 100, 100])
    # select the yellow pixels...
    mask_inverse = cv2.inRange(hsv, lower_yellow, upper_yellow)
    # ...and paint them white by adding the mask onto the original image
    masked_replace_white = cv2.addWeighted(img, 1,
                                           cv2.cvtColor(mask_inverse, cv2.COLOR_GRAY2RGB), 1, 0)
    return masked_replace_white
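For reference, here is a minimal sketch of how the Canny edge detection and Hough transform mentioned above might slot into the pipeline. All the threshold values (50/150 for Canny and the Hough parameters) are illustrative assumptions to tune, not values from my pipeline:

# Illustrative pipeline sketch; threshold values are assumptions to tune
frame = yellow_to_white(image)                      # challenge video: yellow -> white
gray = cv2.cvtColor(frame, cv2.COLOR_RGB2GRAY)
blurred = gaussian_blur(gray, 5)
edges = cv2.Canny(blurred, 50, 150)                 # low/high gradient thresholds
masked_edges = region_of_interest(edges, vertices)
lines = cv2.HoughLinesP(masked_edges, rho=2, theta=np.pi / 180,
                        threshold=15, minLineLength=40, maxLineGap=20)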
Extrapolate
Once we find the intercept, the slope, and the top y position, extrapolation becomes very simple. I won't provide the whole code here, as that is the challenge of the first project, but I will certainly provide some hints.
Rolling slopes and intercepts: take a look at np.linalg.lstsq in the numpy library, which will help you calculate the slope and intercept.
def lines_linreg(lines_array):
    '''
    Calculates the slope and intercept for line segments
    '''
    ### Select the 0th and 2nd index which provide the x values and flatten
    x = np.reshape(lines_array[:, [0, 2]], (1, len(lines_array) * 2))[0]
    ### Select the 1st and 3rd index which provide the y values and flatten
    y = np.reshape(lines_array[:, [1, 3]], (1, len(lines_array) * 2))[0]
    # least-squares fit of y = m*x + c
    A = np.vstack([x, np.ones(len(x))]).T
    m, c = np.linalg.lstsq(A, y)[0]
    x = np.array(x)
    y = np.array(x * m + c).astype('int')
    return x, y, m, c

# Snippet of the extrapolate method
slopes = np.apply_along_axis(lambda row: (row[3] - row[1]) / (row[2] - row[0]), 2, lines)
pos_slopes = slopes > 0.50
pos_lines = lines[pos_slopes]
neg_slopes = slopes < -0.50
neg_lines = lines[neg_slopes]
if len(pos_lines) == 0 or len(neg_lines) == 0:
    return
pos_x, pos_y, pos_m, pos_c = lines_linreg(pos_lines)
# pos_global_slopes / pos_global_intercepts persist across frames,
# giving the rolling-window average over the last 20 frames
pos_global_slopes = np.append(pos_global_slopes, [pos_m])
pos_global_intercepts = np.append(pos_global_intercepts, [pos_c])
pos_m = pos_global_slopes[-20:].mean()
pos_c = pos_global_intercepts[-20:].mean()
Similarly, you can find the slope and intercept for the negative-slope lines. We need the bottom-left (x, y), top-left (x, y), top-right (x, y), and bottom-right (x, y) coordinates to extrapolate the lines. The code below finds the bottom-left and top-left coordinates. I take the minimum top y between the positive and negative lines and keep a rolling average of it. This makes the lines much smoother; if you are seeing a lot of flickering or jitter in the video, this trick will solve it.
bottom_left_y = img.shape[0]
bottom_left_x = int((bottom_left_y - neg_c) / neg_m)
min_top_y = np.min([neg_y.min(), pos_y.min()])
top_y = np.append(top_y, [min_top_y])
min_top_y = int(top_y[-20:].mean())
top_left_y = min_top_y
top_left_x = int((top_left_y - neg_c) / neg_m)
Once you find the coordinates, drawing the lines is very simple with the cv2.line method. The full code can be found here: https://github.com/sarchak/self_driving_car
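As a final hint, here is a minimal sketch of the drawing step. The bottom-right/top-right names are hypothetical; compute them the same way as the left-side coordinates above, using pos_m and pos_c:

line_img = np.zeros_like(image)
# draw the two extrapolated lane lines (right-side coordinates are hypothetical)
cv2.line(line_img, (bottom_left_x, bottom_left_y), (top_left_x, top_left_y), (255, 0, 0), 10)
cv2.line(line_img, (bottom_right_x, bottom_right_y), (top_right_x, top_right_y), (255, 0, 0), 10)
# overlay the extrapolated lines on the original frame
result = cv2.addWeighted(image, 0.8, line_img, 1.0, 0)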