Recently, I have been working on a just-for-fun project for myself. I am creating a sudoku web application. It is built on a Ruby on Rails back-end and a React front-end.
I found an open source dataset of one million sudoku puzzles and solutions that I used to seed my back-end. To simplify matters, I only used 5000 puzzles/solutions to seed the database.
The open source dataset I found was in a CSV file with two columns. The first column was the puzzle, and the second was the solution.
What is a CSV file?
“A CSV file is a Comma Separated Values file. All CSV files are plain text files, can contain numbers and letters only, and structure the data contained within them in a tabular, or table, form.” (What is a CSV file?)
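To make the "plain text in tabular form" idea concrete, here is a minimal sketch that parses a tiny CSV string with Ruby's CSV library. The sample text is made up for illustration (real puzzle strings are much longer), and the column names are an assumption, not the actual headers of the dataset.

```ruby
require 'csv'

# Made-up sample text; the real dataset rows are much longer.
text = "puzzle,solution\n0043,1243\n3001,3421\n"

# CSV.parse turns the plain text into an array of rows,
# and each row into an array of string fields.
rows = CSV.parse(text)

rows.each { |fields| puts fields.inspect }
```

Note that every field comes back as a string, so leading zeros in a puzzle are preserved.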
How to read and access the CSV file?
In order to seed the database with the information from the CSV file, you will need to know the file path where your CSV file is located.
To me it made the most sense to place my CSV file in the lib folder in my Rails API project directory. You may place your file wherever makes the most sense to you.
In Ruby, you can import your CSV file all at once (which stores all of the file content in memory) or read from the file one row at a time. I chose to read the file one row at a time using the CSV.foreach method. The alternative would be to load the entire file into memory at once.
In your seed file you will need to use the following code:
require 'csv'

CSV.foreach(Rails.root.join('lib/sudoku_seeds.csv'), headers: true) do |row|
  Board.create(puzzle: row[0], solution: row[1])
end
In the first line, I require 'csv', which allows me to work with CSV files in Ruby. This works because CSV is part of the Ruby Standard Library.
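The difference between loading the file all at once and reading it row by row can be sketched as follows. This is only an illustration: the temporary file and its rows are made up, and CSV.read is one standard way to load everything at once, not necessarily the alternative I had in mind above.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical stand-in for lib/sudoku_seeds.csv, with made-up rows.
file = Tempfile.new(['sudoku_seeds', '.csv'])
file.write("puzzle,solution\n0043,1243\n3001,3421\n")
file.close

# Option 1: load the whole file into memory at once.
table = CSV.read(file.path, headers: true)
puts table.size

# Option 2: stream the file, handling one row at a time.
streamed = 0
CSV.foreach(file.path, headers: true) do |_row|
  streamed += 1
end
puts streamed
```

For a file with a million rows, the streaming version keeps memory use low because only the current row is held at any time.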
Accessing your file with its path
Before seeing what CSV.foreach does, let’s examine what is happening inside the parentheses.
Argument 1: I know that the local path to my file is 'lib/sudoku_seeds.csv'. Replace this with the file path to your CSV file.
But what is Rails.root.join('lib/sudoku_seeds.csv') doing? If you run Rails.root in the Rails console for your project, the return value is the path to your application's root directory. For example, I ran Rails.root in my Rails console to test it.
Now, when I add .join('lib/sudoku_seeds.csv') to the path returned earlier, my return value is the exact path to the CSV document holding my puzzles and solutions.
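Rails.root returns a Ruby Pathname, so the joining step can be tried outside of Rails as well. In this small sketch the project root '/home/me/sudoku-api' is a hypothetical path standing in for whatever Rails.root returns on your machine.

```ruby
require 'pathname'

# Hypothetical project root; in a Rails app, Rails.root plays this role.
root = Pathname.new('/home/me/sudoku-api')

# join appends the relative path to the root, as Rails.root.join does.
csv_path = root.join('lib/sudoku_seeds.csv')

puts csv_path
```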
Argument 2: The second argument, headers: true, means that the first row of my CSV file contains column names. If your CSV file does not have column names in the first row, then you do not need to include this option.
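The effect of the headers option is easiest to see by parsing the same text with and without it. The text and its column names below are made up for illustration.

```ruby
require 'csv'

# Made-up sample with a header line and one data row.
text = "puzzle,solution\n0043,1243\n"

# Without the option, the header line is treated as an ordinary data row.
without_headers = CSV.parse(text)

# With headers: true, the first line becomes the column names.
with_headers = CSV.parse(text, headers: true)
first = with_headers.first

puts without_headers.length
puts with_headers.length
puts first['puzzle'] # access by column name
puts first[0]        # access by position still works
```

Without headers: true, a seed script would try to create a record out of the column names themselves, which is almost never what you want.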
Having passed the file path and the headers option as arguments to the CSV.foreach method, we also pass it a block. The block is everything between the do and end keywords. In the example, that is the code that creates a board.
The CSV.foreach method passes each row of your CSV file to the block of code. In my case, I call my Board model and create a new board instance for each row.
CSV.foreach yields each row as an array (or, with headers: true, as a CSV::Row, which can be indexed the same way), and each column in a row is accessed by its position. For example, my CSV file only had two columns: the incomplete puzzle, and the correct solution.
Board is assigned a puzzle attribute by accessing the first column with the code row[0], and the solution attribute is assigned by accessing the second column with row[1]. This would continue for as many columns as you need by incrementing the index.
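When the file has headers, the same columns can also be read by name instead of by position. The sketch below shows this with a small stand-in for the Board model so it runs outside of Rails. The Struct-based Board, the sample rows, and the column names 'puzzle' and 'solution' are all assumptions for illustration; in a real seed file, Board would be the ActiveRecord model and the names would come from your CSV's header row.

```ruby
require 'csv'

# Hypothetical stand-in for the Rails Board model, so the sketch
# runs without a database. It only remembers what was created.
Board = Struct.new(:puzzle, :solution, keyword_init: true) do
  def self.all
    @all ||= []
  end

  def self.create(attrs)
    board = new(**attrs)
    all << board
    board
  end
end

# Made-up rows; the header names are assumed.
text = "puzzle,solution\n0043,1243\n3001,3421\n"

CSV.parse(text, headers: true).each do |row|
  # row['puzzle'] reads the same field as row[0].
  Board.create(puzzle: row['puzzle'], solution: row['solution'])
end

puts Board.all.length
```

Reading by name keeps the seed script working even if the column order in the file changes.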
You’ve now successfully seeded your database
Following these steps will allow you to seed your database with a CSV file. If you want to dive into this topic in more detail, Darko Gjorgjievski has a great tutorial broken into two parts. Part 1 and Part 2 can be found by following those links.
Once you have seeded your database, be sure to test your data and make sure that everything is performing as expected!
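One possible sanity check is sketched below. It rests on an assumption about the data format that you should verify against your own file: each puzzle and solution is an 81-character string of digits, with 0 marking a blank cell. The helper name consistent? is made up for this example.

```ruby
# Assumes 81-character digit strings, with 0 marking a blank cell.
# Returns true when every given digit in the puzzle matches the solution.
def consistent?(puzzle, solution)
  return false unless puzzle.match?(/\A\d{81}\z/)
  return false unless solution.match?(/\A[1-9]{81}\z/)

  puzzle.chars.zip(solution.chars).all? do |given, answer|
    given == '0' || given == answer
  end
end

# Made-up strings that only exercise the check; not a real sudoku.
solution = '123456789' * 9
puts consistent?('0' * 81, solution)
puts consistent?('9' + '0' * 80, solution)
```

In a Rails console, the seeded records could be fed through a check like this, for example by iterating over them with Board.find_each.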