More on Parsing Text Files in C++

C++ was not designed primarily for text parsing, and in some respects it is not the most convenient language for manipulating input data. Let us look at a stock price example. Assume we have obtained price data for three Nasdaq-listed stocks, Facebook, Apple, and Amazon, as shown below.

Demo data

Let's say we want to parse the data above into the format below:

'FB' has a price of '273.16' USD 
'AAPL' has a price of '132.69' USD
'AMZN' has a price of '3256.93' USD

Consider the following code and result.

'FB' has a price of '273.16' USD 
'
AAPL' has a price of '132.69' USD
'
AMZN' has a price of '3256.93' USD
'' has a price of '3256.93' USD

The result contains unexpected line breaks and a blank entry. Why does this happen?

This happens because every line of text read through fstream ends with an invisible, non-printing newline character. If it is not consumed, it stays in the stream and becomes the first character of the next value read, which pushes the text that follows it onto a new line. One way to deal with it is to call .get() after each line to consume the newline character.

💡 C++ also provides the std::ws manipulator, which discards all whitespace that follows. .get() is less robust than ws because it removes exactly one character, so it falls short whenever a line ends with more than one whitespace character.

The issue is not fully solved yet. The console output is now as follows.

'FB' has a price of '273.16' 
USD 'AAPL' has a price of '132.69'
USD 'AMZN' has a price of '3256.93' USD
'' has a price of '3256.93' USD

We still have to handle the extra last line, which repeats Amazon's price. Again, why does this happen?

It happens because the loop tries to read a fourth line that does not exist. With no characters left, getline() fails, so the price and currency variables keep their values from the previous iteration, yet the loop body still prints them as if they were a new record. A simple condition check solves the issue. Consider the following code.

Now the result is what we expect.

'FB' has a price of '273.16' USD 
'AAPL' has a price of '132.69' USD
'AMZN' has a price of '3256.93' USD

Conclusion

We have to be very careful when we parse text files as string input with the C++ file stream classes. I have covered some common situations you might face when reading file data, such as the trailing newline character and the failed read at the end of the file, and how to deal with them.
