Python Refresher

File I/O

When gathering data for Machine Learning, you will often read sensor data via serial and save it to a file on your hard drive. Later, you will read that file back into your ML training script.


Opening Files

To work with a file in Python, use the built-in open() function. Its second argument is the mode, which defaults to 'r' if you leave it out:

  • 'r' - Read (default). Opens the file for reading.
  • 'w' - Write. Opens the file for writing (creates it if it doesn't exist, erases it if it does).
  • 'a' - Append. Opens the file for writing, but adds to the end instead of erasing it.
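The three modes can be compared side by side in a short sketch. The file name modes_demo.txt and the temporary directory are assumptions made for illustration, and explicit .close() calls are used so each step is visible.

```python
import os
import tempfile

# Hypothetical scratch file inside a temporary directory (illustration only)
path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

# 'r' on a file that does not exist yet raises FileNotFoundError
try:
    open(path, "r")
    missing_file_error = False
except FileNotFoundError:
    missing_file_error = True

# 'w' creates the file
f = open(path, "w")
f.write("first\n")
f.close()

# Opening with 'w' again erases what was there
f = open(path, "w")
f.write("second\n")
f.close()

# 'a' keeps the existing contents and adds to the end
f = open(path, "a")
f.write("third\n")
f.close()

# 'r' reads the file back
f = open(path, "r")
contents = f.read()
f.close()

print(contents)  # "first" is gone; "second" and "third" remain
```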

The with statement (Best Practice)

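When you open a file, the operating system hands your program a file handle, a limited resource that stays reserved until the file is closed. You must remember to call .close() when you are done. If your script crashes before .close() is called, data still sitting in the write buffer may never reach the disk, leaving the file incomplete.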

The with statement closes the file automatically when its block ends, even if an exception is raised inside the block.

python
# 1. Writing to a file (Erases existing data!)
with open('sensor_log.txt', 'w') as file:
    file.write("Temperature Log\n")
    file.write("---------------\n")

# 2. Appending to a file (Adds to the end)
with open('sensor_log.txt', 'a') as file:
    file.write("24.5\n")
    file.write("25.1\n")

# 3. Reading from a file
with open('sensor_log.txt', 'r') as file:
    contents = file.read()
    print("File Contents:")
    print(contents)
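For comparison, here is roughly what the with statement does for you, written out by hand with try/finally. This is a minimal sketch rather than the exact mechanism Python uses internally; the file name manual_close_demo.txt, the temporary directory, and the simulated error are assumptions for illustration.

```python
import os
import tempfile

# Hypothetical scratch file (illustration only)
path = os.path.join(tempfile.mkdtemp(), "manual_close_demo.txt")

# Manual version: close() must sit in a finally block to be guaranteed
file = open(path, "w")
try:
    file.write("24.5\n")
finally:
    # Runs whether or not the write raised an exception
    file.close()

# with version: the file is closed even though the block raises
try:
    with open(path, "a") as file:
        file.write("25.1\n")
        raise RuntimeError("simulated sensor failure")
except RuntimeError:
    pass

closed_after_error = file.closed
print(closed_after_error)  # True

# Both writes reached the disk
with open(path, "r") as file:
    lines = file.read().splitlines()
```

The with form is shorter and removes the risk of forgetting the finally block.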

Working with CSV Data

In Data Science and Edge AI, data is very often stored in CSV (Comma-Separated Values) format. Python has a built-in csv module that handles reading and writing these files.

Writing a CSV

python
import csv

# Example data: [Timestamp, Temp, Humidity]
data = [
    ["10:00:00", 25.4, 60.1],
    ["10:05:00", 25.6, 59.8],
    ["10:10:00", 26.0, 58.5]
]

# Open with newline='' to prevent double-spacing on Windows
with open('environment_data.csv', 'w', newline='') as file:
    writer = csv.writer(file)

    # Write the header row
    writer.writerow(["Time", "Temperature", "Humidity"])

    # Write the data rows
    writer.writerows(data)

print("Data saved to environment_data.csv!")
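When readings arrive one at a time, as in the serial logging scenario described earlier, one option is to write the header once and then append a row per reading. A minimal sketch, assuming a hypothetical helper log_reading and a hypothetical file name stream_log.csv (neither comes from the original example):

```python
import csv
import os
import tempfile

# Hypothetical log file inside a temporary directory (illustration only)
csv_path = os.path.join(tempfile.mkdtemp(), "stream_log.csv")

def log_reading(path, timestamp, temp, humidity):
    """Append one row, writing the header first if the file is new."""
    is_new = not os.path.exists(path)
    # 'a' mode keeps earlier rows; newline='' avoids blank lines on Windows
    with open(path, "a", newline="") as file:
        writer = csv.writer(file)
        if is_new:
            writer.writerow(["Time", "Temperature", "Humidity"])
        writer.writerow([timestamp, temp, humidity])

# Simulated readings standing in for values received from a sensor
log_reading(csv_path, "10:00:00", 25.4, 60.1)
log_reading(csv_path, "10:05:00", 25.6, 59.8)

# Read everything back to confirm the header appears only once
with open(csv_path, "r", newline="") as file:
    rows = list(csv.reader(file))

print(rows)
```

Reopening the file for each reading is slower than keeping it open, but every row is flushed to disk as soon as the with block ends.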

Reading a CSV

python
import csv

# The csv module recommends newline='' when reading as well as writing
with open('environment_data.csv', 'r', newline='') as file:
    reader = csv.reader(file)

    # The reader is an iterable object
    for row in reader:
        # Each row is returned as a Python list of strings
        print(row)

# Output:
# ['Time', 'Temperature', 'Humidity']
# ['10:00:00', '25.4', '60.1']
# ['10:05:00', '25.6', '59.8']
# ...
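Because every field comes back as a string, numeric columns need converting before a training script can use them. The sketch below skips the header with next() and converts the readings to float. It writes its own small sample file first so that it runs on its own; the file name sample_readings.csv and the variable names are assumptions for illustration.

```python
import csv
import os
import tempfile

# Hypothetical sample file so the example is self-contained
sample_path = os.path.join(tempfile.mkdtemp(), "sample_readings.csv")

with open(sample_path, "w", newline="") as file:
    writer = csv.writer(file)
    writer.writerow(["Time", "Temperature", "Humidity"])
    writer.writerows([["10:00:00", 25.4, 60.1], ["10:05:00", 25.6, 59.8]])

timestamps, temperatures, humidities = [], [], []

with open(sample_path, "r", newline="") as file:
    reader = csv.reader(file)
    header = next(reader)  # consume the header row so it is not parsed as data
    for time_str, temp_str, hum_str in reader:
        timestamps.append(time_str)            # keep timestamps as text
        temperatures.append(float(temp_str))   # '25.4' -> 25.4
        humidities.append(float(hum_str))

print(header)
print(temperatures)
print(humidities)
```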

Once your CSV files get massive (megabytes or gigabytes of accelerometer data), processing them row by row with the built-in csv module can become slow and cumbersome. That is when you switch to using NumPy and Pandas.
