Read a text file line by line

There are lots of times when you may need to read your text file line by line. Each line of the text file can then be processed, depending on what you want to do with it. For example, in our postcode text file, we have the geographical areas first and then the code:

A text file that needs parsing line by line

What we'd like to do, however, is to have the code first and then the area. This means it will be sorted alphabetically. So we want it like this:

AB Aberdeen
AL St Albans
B Birmingham
BA Bath

What we can do is to open up our text file, read each line one at a time, and then pass the line to a function so we can reverse the order.

We can adapt our openfile function from a previous lesson. At the moment, the if statement part looks like this:

if filename:
    the_file = open(filename)
    textArea.delete('1.0', tk.END)
    textArea.insert(tk.END, the_file.read())
    the_file.close()

Once you have created a file object, you can use the inbuilt functions to manipulate that file object. Ours is called the_file. One of the functions that you can use is called readlines (there's also one called readline, singular):

the_file.readlines()

The readlines function, as its name suggests, reads all the lines of your text file and hands them back as a list. You can use it in a loop to go through all the lines of the text file:

for line in the_file.readlines():

The variable called line above will change each time round the loop. It will contain a single line from your file. The loop ends when there are no more lines to be read.
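As a rough sketch of that loop outside the GUI, the snippet below writes a small sample file and reads it back with readlines. The file name sample_postcodes.txt and its contents are made up for illustration and are not the lesson's real postcodes file. Notice that each line still has its newline character on the end.

```python
import os
import tempfile

# Hypothetical sample data, standing in for the postcodes file.
sample_text = "Aberdeen AB\nSt Albans AL\nBirmingham B\n"

# Write the sample to a temporary folder so the example is self-contained.
filename = os.path.join(tempfile.mkdtemp(), "sample_postcodes.txt")
with open(filename, "w") as out_file:
    out_file.write(sample_text)

the_file = open(filename)
lines_read = []
for line in the_file.readlines():
    # Each line still ends with '\n' at this point.
    lines_read.append(line)
the_file.close()

print(lines_read)
```

Looping over the file object directly (for line in the_file:) gives you the same lines without building the whole list first.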

You can hand the line variable over to a function:

text_line = parseline(line)

The parseline function, which we haven't yet created, can then be used to do something with your line of text.

Whatever that 'something' is that you want to do with your line can be returned. We're returning a value from the parseline function and placing that value into a new variable that we've called text_line. (Remember: a function on the right of an equals sign is one that returns a value.)

Finally, we can put the newly-arranged text line into the text area:

textArea.insert(tk.END, text_line + '\n')

The + '\n' at the end gets us a new line after every text line.

The Parse Line function

For us, we'd like each line to go from this format:

Aberdeen AB

to this:

AB, Aberdeen

Here, we've put the code at the start and the geographical area at the end. We'd like a comma to separate the two. So we need some string manipulation in our function. We'll set up a new function called parseline:

def parseline(the_line):

As the first indented line of the function, we can get rid of any white space at the start and end of the line, including the newline character at the end:

parsed_line = the_line.strip()
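To see what strip does here, this small sketch uses a made-up line of the kind readlines would hand over. Using repr when printing makes the newline character visible.

```python
# A line as it might come out of readlines, with a newline on the end.
the_line = "Aberdeen AB\n"

parsed_line = the_line.strip()

print(repr(the_line))     # 'Aberdeen AB\n'
print(repr(parsed_line))  # 'Aberdeen AB'
```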

If you look at the postcodes file, you'll see a lot of the text lines have more than one space in them:

East London E
East Central London EC

The first one above has two spaces while the second one has three. What we can do is to find out where the last of these spaces is. We can then use this information with string slicing. (You learned about string slicing in an earlier section.)

A useful string function for our purposes is one called rfind. You used find to find the first position of a character. For example:

text = "This is some text"
space_pos = text.find(' ')

This code will find the first space, the one between the words 'This' and 'is'. It will give you the position number of the space. The rfind function searches from the right, so it tells you the position of the last occurrence of a character in your string. So this:

text = "This is some text"
space_pos = text.rfind(' ')

will get you the position of the final space, the one between the words 'some' and 'text'. The second line of the function, then, will be this:

space_pos = parsed_line.rfind(' ')
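Here is a short sketch comparing find and rfind on one of the lines from the postcodes file. The variable names first_space and last_space are just for illustration.

```python
text = "East Central London EC"

first_space = text.find(' ')   # position of the first space
last_space = text.rfind(' ')   # position of the last space

print(first_space, last_space)  # 4 19

# Both functions return -1 when the character isn't there at all.
print("Aberdeen".rfind(' '))    # -1
```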

Now that we have the position of the final space, we can use that number in some string slicing:

new_text_line = parsed_line[space_pos:] + ", " + parsed_line[:space_pos]

If you're confused as to what this line does, review the earlier section on slicing. We did pretty much the same thing when we rearranged a first name and surname. Here, we're just chopping off the postcode at the end and putting it at the beginning, along with a comma.
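If it helps, the slicing can be broken into steps. The sketch below uses one sample line, and the names code_part and area_part are made up for illustration. Notice that the first slice starts at the space itself, so the space comes along with the code.

```python
parsed_line = "East Central London EC"
space_pos = parsed_line.rfind(' ')       # 19

code_part = parsed_line[space_pos:]      # ' EC' (the space comes along too)
area_part = parsed_line[:space_pos]      # 'East Central London'

new_text_line = code_part + ", " + area_part
print(repr(new_text_line))               # ' EC, East Central London'
print(repr(new_text_line.strip()))       # 'EC, East Central London'
```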

The final line of the function returns the parsed line. The slice that grabs the postcode starts at the space, so the new line begins with a space. We use strip again to get rid of it:

return new_text_line.strip()

So amend your openfile if statement to this:

if filename:
    the_file = open(filename)

    for line in the_file.readlines():
        text_line = parseline(line)
        textArea.insert(tk.END, text_line + '\n')

    the_file.close()

And here it is in the coding window:

Python code that opens up a file and reads it line by line

Notice that we've moved the textArea.delete line out of the try … except statement and moved it to just after the line that displays the dialog box. This line can't be part of the for loop because your text will get deleted each time round the loop. Move your textArea.delete line to match ours.

You'll have a red line under parseline, though. Get rid of it by adding this function:

def parseline(the_line):
    parsed_line = the_line.strip()
    space_pos = parsed_line.rfind(' ')
    new_text_line = parsed_line[space_pos:] + ", " + parsed_line[:space_pos]
    return new_text_line.strip()
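One thing to be aware of: rfind returns -1 when there's no space to find, which happens with a blank line or a one-word line. The slicing then gives odd results (a blank line comes back as a lone comma). The sketch below is one possible way to guard against that. The name parseline_safe is hypothetical and isn't part of the lesson's code.

```python
def parseline_safe(the_line):
    parsed_line = the_line.strip()
    space_pos = parsed_line.rfind(' ')

    # rfind returns -1 when there is no space to split on
    # (a blank line, or a line with just one word).
    if space_pos == -1:
        return parsed_line

    code = parsed_line[space_pos + 1:]      # skip the space itself
    area = parsed_line[:space_pos].strip()
    return code + ", " + area


print(parseline_safe("Aberdeen AB\n"))   # AB, Aberdeen
print(repr(parseline_safe("\n")))        # ''
print(parseline_safe("Aberdeen"))        # Aberdeen
```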

This is your coding window now:

Adding a Python function to parse a line of text

Give it a try. Run your program. Open up your postcodes file and you should find that the text area displays the lines in our preferred order:

An example of text file lines that have been parsed

Reading text files line by line is a very useful technique to learn. It comes in handy whenever you need to rearrange your lines or add something to them.
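If you'd like to try the reading and parsing without the text area, here is a minimal sketch that collects the parsed lines in a list instead. The sample lines and the file name are made up. The with statement is a variation on the lesson's open and close calls: it closes the file for you when the block ends.

```python
import os
import tempfile


def parseline(the_line):
    parsed_line = the_line.strip()
    space_pos = parsed_line.rfind(' ')
    new_text_line = parsed_line[space_pos:] + ", " + parsed_line[:space_pos]
    return new_text_line.strip()


# Made-up sample lines in the same "area code" layout as the postcodes file.
sample_text = "Aberdeen AB\nSt Albans AL\nEast Central London EC\n"

filename = os.path.join(tempfile.mkdtemp(), "sample_postcodes.txt")
with open(filename, "w") as out_file:
    out_file.write(sample_text)

results = []
# 'with' closes the file for us, even if an error occurs in the loop.
with open(filename) as the_file:
    for line in the_file.readlines():
        results.append(parseline(line))

for text_line in results:
    print(text_line)
```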

In the next lesson, you'll learn how to save text files in Python.
