Parsing a CSV File in Ruby

Vakas Akhtar
Published in The Startup
3 min read · Feb 2, 2021


CSV stands for comma-separated values, a plain-text format commonly used to exchange spreadsheet data. In a program like Excel the data appears in rows and columns, each value in its own cell, but viewed as raw text the CSV file will look something like this.

name,age,gender
Bill,24,male
Ana,20,female

Ruby has a built-in library that lets us do many things with CSV files. To begin, create a new file and add

require 'csv' 

at the top. You now have a whole set of methods for interacting with your CSV file. To simply read the file, enter the following code.

require 'csv'
CSV.read("MY_FILE.csv")

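or you can read the file’s contents into a string and parse that, since CSV.parse expects CSV data rather than a file path

require 'csv'
CSV.parse(File.read("MY_FILE.csv"))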
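To see what these calls hand back, here is a minimal sketch that parses an inline string, so it runs without a file on disk. The constant SAMPLE_CSV and the method parse_sample are made-up names for illustration; CSV.read returns the same shape when given a file path.

```ruby
require 'csv'

# Hypothetical sample data standing in for the contents of MY_FILE.csv.
SAMPLE_CSV = <<~DATA
  name,age,gender
  Bill,24,male
  Ana,20,female
DATA

# CSV.parse takes the CSV text itself and, with no options,
# returns an array of arrays of strings.
def parse_sample(text = SAMPLE_CSV)
  CSV.parse(text)
end

rows = parse_sample
puts rows.length  # number of rows, header row included
p rows.first      # the header row is just another array here
```

Note that every field comes back as a string, including the ages.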

As written, however, these calls treat the header row as just another row of data. They also load the entire file into memory at once, which becomes a problem if the file is large, say greater than 10 MB. A better approach for both issues is to call the foreach method with a block and read the file line by line, passing headers: true so the first row is treated as headers. This way the program uses far less memory and can start working on rows before the whole file has been read. This approach would look something like this.

require 'csv'

CSV.foreach("MY_FILE.csv", headers: true, col_sep: ",") do |row|
  puts row
end

Inside the foreach call we pass the file path along with options such as headers and col_sep. With headers: true the program treats the first row as headers, and you can reference them as you iterate through each row. The column separator is exactly what it sounds like: the character that separates each column. In our case the columns are separated by commas, which is also the default, but a file could be separated by something else, such as pipes, in which case our col_sep would look like this

col_sep: "|"
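As a quick check of the pipe separator, the sketch below parses a small pipe-delimited string. PIPE_CSV and names_from_pipes are hypothetical names, and the inline string stands in for a real file; with a file you would pass the same options to CSV.foreach.

```ruby
require 'csv'

# Hypothetical pipe-delimited data used in place of a file on disk.
PIPE_CSV = <<~DATA
  name|age|gender
  Bill|24|male
  Ana|20|female
DATA

# Collect the "name" column from pipe-separated text.
# headers: true makes each row addressable by header name.
def names_from_pipes(text = PIPE_CSV)
  CSV.parse(text, headers: true, col_sep: "|").map { |row| row["name"] }
end

p names_from_pipes
```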

Now you can play around with this script. I suggest printing each row’s values by column index or by column name, like so, to get a better idea of how things are working.

require 'csv'

CSV.foreach("MY_FILE.csv", headers: true, col_sep: ",") do |row|
  puts row[0]
  puts row["age"]
end
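If you want to inspect a row more closely, the following sketch shows a few ways of looking at it. It assumes the sample data from the start of the article, supplied here as an inline string; ROW_DEMO_CSV and first_row_details are names invented for this example.

```ruby
require 'csv'

# Hypothetical inline copy of the sample data.
ROW_DEMO_CSV = <<~DATA
  name,age,gender
  Bill,24,male
  Ana,20,female
DATA

# Look at the first data row in several ways.
def first_row_details(text = ROW_DEMO_CSV)
  row = CSV.parse(text, headers: true).first
  {
    by_index: row[0],             # first column, by position
    by_header: row["name"],       # the same column, by header name
    age_class: row["age"].class,  # fields are Strings unless converted
    as_hash: row.to_h             # the whole row as a plain Hash
  }
end

p first_row_details
```

Because fields are read as strings, call to_i on a value such as row["age"] before doing arithmetic with it.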

From here you can get creative and do whatever you’d like with your CSV file. Go ahead and try iterating through each row and adding some data to an array while the foreach method runs through your CSV file.

require 'csv'

array = []
CSV.foreach("MY_FILE.csv", headers: true, col_sep: ",") do |row|
  array << row[0]
end
puts array

Now you have an array with all the data from your first column. The header itself is not in the array, because headers: true sets the first row aside. With this array we can now create a new CSV file.

require 'csv'

array = []
CSV.foreach("MY_FILE.csv", headers: true, col_sep: ",") do |row|
  array << row[0]
end

CSV.open('NEW.csv', 'wb') do |csv|
  csv << ["name"]
  array.each do |name|
    csv << [name]
  end
end

With CSV.open, the first argument is the name of the CSV file to write, and the second argument, 'wb', opens it for writing in binary mode, creating the file or overwriting it if it already exists. From here we write our first line, a header called name, by shoveling it into csv, the block variable that represents the output file. Next we loop through the array of names we collected earlier and add each one to the CSV as its own row. The loop runs once for each element in the array. After this you can look in your project directory and you’ll see your new CSV file.
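To try the writing step without touching your project directory, here is a small sketch that writes to a temporary directory and prints the result. The method write_names is a hypothetical helper, not part of the csv library, and the names are passed in directly rather than read from MY_FILE.csv.

```ruby
require 'csv'
require 'tmpdir'

# Writes a one-column CSV with a "name" header to path and returns the path.
def write_names(path, names)
  CSV.open(path, 'wb') do |csv|
    csv << ["name"]                      # header row
    names.each { |name| csv << [name] }  # one row per name
  end
  path
end

# The temporary directory is removed when the block finishes.
Dir.mktmpdir do |dir|
  path = write_names(File.join(dir, "NEW.csv"), ["Bill", "Ana"])
  puts File.read(path)
end
```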

To close, there are many ways to manipulate your CSV files and to output them. Using what was covered in this article, you could parse a CSV file and use that data to seed the database of your Rails API. Or you could convert your parsed data to JSON. The possibilities are endless.
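As one example of the JSON idea, here is a minimal sketch using Ruby’s standard json library. JSON_SOURCE_CSV and csv_to_json are illustrative names, and the inline string stands in for a real file.

```ruby
require 'csv'
require 'json'

# Hypothetical inline copy of the sample data.
JSON_SOURCE_CSV = <<~DATA
  name,age,gender
  Bill,24,male
  Ana,20,female
DATA

# Turn each row into a Hash keyed by header, then serialize the list.
def csv_to_json(text = JSON_SOURCE_CSV)
  rows = CSV.parse(text, headers: true).map(&:to_h)
  JSON.generate(rows)
end

puts csv_to_json
```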
