Python Refresher
File I/O
When gathering data for Machine Learning, you will often read sensor data via serial and save it to a file on your hard drive. Later, you will read that file back into your ML training script.
Opening Files
To work with a file in Python, you use the built-in open() function. Its second argument is the mode, which defaults to 'r' if you leave it out:
'r' - Read (default). Opens the file for reading.
'w' - Write. Opens the file for writing (creates it if it doesn't exist, erases it if it does).
'a' - Append. Opens the file for writing, but adds to the end instead of erasing it.
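As a rough illustration of how the three modes differ, the sketch below writes to the same file twice with 'w', then once with 'a', and reads the result back. The file name mode_demo.txt and the temporary directory are assumptions made for the example, chosen so nothing real gets overwritten. It uses the with statement covered in the next section.

```python
import os
import tempfile

# Hypothetical file inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "mode_demo.txt")

with open(path, "w") as f:   # 'w' creates the file
    f.write("first\n")

with open(path, "w") as f:   # 'w' again: the old contents are erased
    f.write("second\n")

with open(path, "a") as f:   # 'a' keeps what is there and adds to the end
    f.write("third\n")

with open(path) as f:        # no mode given, so the default 'r' is used
    contents = f.read()

print(contents)
```

Only "second" and "third" survive, because the second 'w' open wiped out "first".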
The with statement (Best Practice)
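When you open a file, the operating system gives your program a file handle, a limited resource that stays in use until you release it. You must remember to call .close() when you are done. If your script crashes before .close() is called, data still waiting in Python's write buffer may never reach the disk, leaving the file incomplete.
The with statement closes the file automatically when its block ends, even if an exception is raised inside the block.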
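To make that guarantee concrete, here is a minimal sketch comparing a manual try/finally with the equivalent with block. The file name with_demo.txt, the temporary directory, and the simulated RuntimeError are all assumptions made for the demonstration.

```python
import os
import tempfile

# Hypothetical file inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")

# Manual version: try/finally makes sure close() runs even if write() raises.
manual = open(path, "w")
try:
    manual.write("24.5\n")
finally:
    manual.close()

# 'with' version: the same guarantee with less code.
try:
    with open(path, "a") as managed:
        managed.write("25.1\n")
        raise RuntimeError("simulated failure mid-write")
except RuntimeError:
    pass  # the error still leaves the with block; it is caught here

# Both files are closed, including the one whose block raised.
print(manual.closed, managed.closed)
```

Because the file is closed on the way out of the block, the line written before the simulated failure is still flushed to disk.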

    # 1. Writing to a file (Erases existing data!)
    with open('sensor_log.txt', 'w') as file:
        file.write("Temperature Log\n")
        file.write("---------------\n")

    # 2. Appending to a file (Adds to the end)
    with open('sensor_log.txt', 'a') as file:
        file.write("24.5\n")
        file.write("25.1\n")

    # 3. Reading from a file
    with open('sensor_log.txt', 'r') as file:
        contents = file.read()
        print("File Contents:")
        print(contents)

Working with CSV Data
In Data Science and Edge AI, tabular data is very often stored in CSV (Comma-Separated Values) format. Python's built-in csv module makes reading and writing these files straightforward.
Writing a CSV

    import csv

    # Example data: [Timestamp, Temp, Humidity]
    data = [
        ["10:00:00", 25.4, 60.1],
        ["10:05:00", 25.6, 59.8],
        ["10:10:00", 26.0, 58.5]
    ]

    # Open with newline='' to prevent double-spacing on Windows
    with open('environment_data.csv', 'w', newline='') as file:
        writer = csv.writer(file)

        # Write the header row
        writer.writerow(["Time", "Temperature", "Humidity"])

        # Write the data rows
        writer.writerows(data)

    print("Data saved to environment_data.csv!")

Reading a CSV

    import csv

    # newline='' lets the csv module handle line endings itself
    with open('environment_data.csv', 'r', newline='') as file:
        reader = csv.reader(file)

        # The reader is an iterable object
        for row in reader:
            # Each row is returned as a Python list of strings
            print(row)

    # Output:
    # ['Time', 'Temperature', 'Humidity']
    # ['10:00:00', '25.4', '60.1']
    # ['10:05:00', '25.6', '59.8']
    # ...

Once your CSV files get massive (many megabytes or gigabytes of accelerometer data), looping over rows with the built-in csv module can become slow and awkward. That is when you switch to NumPy and pandas.
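For small files the csv module is still enough, but because csv.reader returns every field as a string, numeric columns have to be converted before a training script can use them. The sketch below shows one way to do that. It reads from an in-memory string that stands in for environment_data.csv (an assumption, so the example runs on its own), and the variable names are illustrative.

```python
import csv
import io

# Stand-in for the contents of environment_data.csv.
csv_text = (
    "Time,Temperature,Humidity\n"
    "10:00:00,25.4,60.1\n"
    "10:05:00,25.6,59.8\n"
)

reader = csv.reader(io.StringIO(csv_text))
header = next(reader)  # consume the header row so it is not parsed as data

temperatures = []
for row in reader:
    # Column 1 is Temperature; convert the string to a float.
    temperatures.append(float(row[1]))

average = sum(temperatures) / len(temperatures)
print(header)
print(temperatures)
```

With a real file, the io.StringIO(csv_text) argument would be replaced by the file object from open('environment_data.csv', newline='').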

