Code 401 Class 03 Reading Notes
Read & Write Files in Python
What Is a File?
A file is a contiguous set of bytes used to store data. The data is organized in a specific format and can be anything as simple as a text file or as complicated as a program executable. All these byte files are just binary 1’s and 0’s for easier processing by the computer.
Three main parts for the modern file system.
- Header: metadata about the contents of the file (file name, size, type)
- Data: contents of the file as written by the creator or editor
- End of File (EOF): special character that indicates the end of the file.
File Paths
The file path is a string that represents the location of a file.
- Folder Path: the file folder location on the file system where subsequent folders are separated by a forward slash / (Unix) or a backslash \ (Windows)
- File Name: the actual name of the file
- Extension: the end of the file path pre-pended with a period(.)used to indicate the file type.
/
│
├── path/ ← Referencing this parent folder
| │
| ├── to/ ← Current working directory (cwd)
| │ └── cats.gif
| │
| └── dog_breeds.txt ← Accessing this file
|
└── animals.csv
The double (..) can be chained together to traverse multiple directories above the current directory. For example, to access animals.csv from the to folder, you would uses ../../animals.csv
.
Line Endings
The line ending has its roots from back in the Morse Code era, when a specific pro-sign was used to communicate the end of a transmission or the end of a line.
This was carried over by the International Organization for Standardization(ISO) and the American Standards Association(ASA).
Windows view new lines like the below:
Pug\r\n
Jack Russell Terrier\r\n
English Springer Spaniel\r\n
German Shepherd\r\n
While Unix will view it as the below:
Pug\r
\n
Jack Russell Terrier\r
\n
English Springer Spaniel\r
\n
German Shepherd\r
\n
This can make iterating over each line problematic, and we may need to account for situations like this.
Character Encodings
The two most common encodings are the ASCII and UNICODE formats. ASCII can only store 128 characters, while Unicode can contain up to 1,114,112 characters.
Opening and Closing a File in Python
Invoke the open()
built-in function when you want to open a file when you want to work on it.
Remember to close the file after you are doing using it. Most cases, the file will close on it’s own, but this may cause unusual behavior.
See below for an example to open and close a file.
reader = open('dog_breeds.txt')
try:
# Further file processing goes here
finally:
reader.close()
The with statement automatically takes care of closing the file once it leaves the with block, even in cases of error.
with open('dogs_breeds.txt','r') as reader:
# Further file processing goes here
The ‘r’ is a default, read-ony mode as a text file.
Text File Types
open('abc.txt')
open('abc.txt','r')
open('abc.txt','w')
Buffered Binary File Types
A buffered binary file type is used for reading and writing binary files. Here are some examples of how these files are opened:
open('abc.txt','rb')
open('abc.txt','wb')
Raw File Types
A raw file type is:
“generally used as a low-level building-block for binary and text streams” python library
These are not typically used.
Reading and Writing Opened Files
- .read(size=-1) This reads from the file based on the number of size bytes. If no argument is passed or None or -1 passed. then the entire file is read.
- .readline(size=-1) This reads at most size number of characters from the line. This continues to the end of the line and then wraps back around. If no argument is passed or None or -1 is passed, then the entire line (or rest of the line) is read.
- .readlines() This reads the remaining lines from the file object and returns them as a list.
with open('dog_breads.txt','r') as reader:
print(reader.read())
Pug
Jack Russell Terrier
English Springer Spaniel
German Shepherd
Staffordshire Bull Terrier
Cavalier King Charles Spaniel
Golden Retriever
West Highland White Terrier
Boxer
Border Terrier
with open('dog_breeds.txt', 'r') as reader:
print(reader.readline(5))
print(reader.readline(5))
print(reader.readline(5))
print(reader.readline(5))
print(reader.readline(5))
Pug
Jack
Russe
ll Te
rrier
Iterating Over Each Line in the File
with open('dog_breeds.txt','r') as reader:
line = reader.readline()
while line != '':
print(line, end='')
line = reader.readline()
with open('dog_breeds.txt','r') as reader:
for line in reader.readlines():
print(line,end='')
with open('dog_breeds.txt','r') as reader:
for line in reader:
print(line, end='')
// All above will render the below list, separately
Pug
Jack Russell Terrier
English Springer Spaniel
German Shepherd
Staffordshire Bull Terrier
Cavalier King Charles Spaniel
Golden Retriever
West Highland White Terrier
Boxer
Border Terrier
Working With Bytes
Value | Interpretation |
---|---|
0x89 | A “magic” number to indicate that this is the start of a PNG |
0x50 0x4E 0x47 | PNG in ASCII |
0x0D 0x0A | A DOS style line ending \r\n |
0x1A | A DOS style EOF character |
0x0A | A Unix style line ending \n |
Appending to a File
You can append to a file or start writing at the end of a populated file bu adding the ‘a’ character for the mode argument:
with open('dog_breeds.txt','a') as a_writer:
a_writer.write('\nBeagle')
with open('dog_breeds.txt','r') as reader:
print(reader.read())
Pug
Jack Russell Terrier
English Springer Spaniel
German Shepherd
Staffordshire Bull Terrier
Cavalier King Charles Spaniel
Golden Retriever
West Highland White Terrier
Boxer
Border Terrier
Beagle
Exceptions in Python
Exceptions versus Syntax Errors
Syntax Errors occur when the parser detects an incorrect statement
Example:
print( 0 / 0))
File "<stdin>", line 1
print( 0 / 0))
^
SyntaxError: invalid syntax
There are one too many brackets, remove it and you get the below
print(0 / 0) Traceback (most recent call last): File "<stdin>", line 1, in <module> ZeroDivisionError: integer division or modulo by zero
This time, we ran into an exception error. When this error occurs, syntactically correct Python code results in an error.The last line of the message indicated what type of exception error you ran into.
Raising an Exception
We can use raise to throw an exception if a condition occurs. The statement can be complemented with a custom exception.
Example: x = 10 if x > 5: raise Exception('x should not exceed 5. The value of x was: {}'.format(x)) Result: Traceback (most recent call last): File "<input>", line 4 in <module> Exception: x should not exceed 5. The value of x was 10
The program comes to a full stop and displays the exception to screen.
The AssertionError Exception
Making an assertion in Pything will crash once a condition is met. If the condition turns out to, the program continues. If the condition is false, you can have the program throw an AssertionError exception
Pass Example: import sys assert ('linux' in sys.platform), 'This Code runs on Linux only.' Fail Example: Traceback (most recent call last): File "<input>', line 2, in <module> AssertionError: This code runs on Linux only.
The try and except Block: Handling Exceptions
The code that follows the except statement is the program’s response to any exception in the preceding try clause.
Example: def linux_interaction(): assert ('linux' in sys.plateform), 'Function can only run on Linux systems.' print('Doing something.') Pass Example while on Windows: try: linux_interaction() except: pass Fail Example while on Windows: try: linux_interaction() except: print('Linux function was not executed')
The else Clause
In Python, using the else statement, you can instruct a program to execute a certain block of code only in the basence of exceptions.
Example: try: linux_interaction() except AssertionError as error: print(error) else: print('Executing the else clause.') Output: Doing something. Executing the else clause.
Cleaning Up After Using finally
Cleaning up after executing your code.
Running the below code on a Windows machine, the below output will occur.
Example: try: linux_interaction except AssertionError as error: print(error) else: try: with open('file.log') as file: read_data = file.read() except FileNotFoundError as fnf_error: print(fnf_error) finally: print('Cleaning up, irrespective of any exceptions.') Output: Function can only run on Linux systems. Cleaning up, irrespective of any exceptions.
- raise allows you to throw an exception at any time.
- assert enables you to verify if a certain condition is met and throw an exception if it isn’t.
- In the try clause, all statements are executed until an exception is encountered.
- except is used to catch and handle the exception(s) that are encountered in the try clause.
- else lets you code sections that should run only when no exceptions are encountered in the try clause.
- finally enables you to execute sections of code that should always run, with or without any previously encountered exceptions.