Chapter 272 Chen Cheng's Choice_3

import os
import re

import pandas as pd

...

date_pattern = r"\d{4}-\d{2}-\d{2}"  # Date format
amount_pattern = r"\d+(?:\.\d+)?"  # Amount format (non-capturing group, so findall returns the whole number)
location_pattern = r"阿苏港"  # Location keyword

# Process text files
for filename in os.listdir(data_folder):
    if filename.endswith(".txt"):
        filepath = os.path.join(data_folder, filename)
        try:
            with open(filepath, 'r', encoding='utf-8') as f:
                ...

...

The code Chen Cheng wrote first defined regular expressions to match dates, amounts, and key locations. It then iterated over every Excel, CSV, and text file on the hard drive, loading the tabular data with the pandas library and using the regular expressions to extract the matching information.

All the extracted data was collected in a list, then converted into a DataFrame and saved to a CSV file for later analysis.
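For readers curious how such a pipeline fits together, here is a minimal sketch of the steps just described. It is not Chen Cheng's actual code, most of which is elided in the excerpt. The helper names `read_as_text` and `extract_records`, the rule of keeping only lines that mention the location keyword, and the two made-up sample files are all assumptions for illustration.

```python
import os
import re
import tempfile

import pandas as pd

DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")
AMOUNT_PATTERN = re.compile(r"\d+(?:\.\d+)?")
LOCATION_PATTERN = re.compile(r"阿苏港")


def read_as_text(filepath):
    """Hypothetical helper: return a file's contents as a list of text lines."""
    lower = filepath.lower()
    if lower.endswith(".txt"):
        with open(filepath, "r", encoding="utf-8") as f:
            return f.read().splitlines()
    if lower.endswith(".csv"):
        table = pd.read_csv(filepath, dtype=str, header=None)
    elif lower.endswith((".xlsx", ".xls")):
        # Reading Excel files needs an engine such as openpyxl installed.
        table = pd.read_excel(filepath, dtype=str, header=None)
    else:
        return []  # Unsupported file type: nothing to scan.
    # Flatten each table row into one line of text so the same regexes apply.
    rows = table.fillna("").astype(str).values.tolist()
    return [" ".join(row) for row in rows]


def extract_records(data_folder):
    """Hypothetical helper: scan every supported file and collect matches."""
    records = []
    for filename in sorted(os.listdir(data_folder)):
        filepath = os.path.join(data_folder, filename)
        try:
            lines = read_as_text(filepath)
        except (OSError, UnicodeDecodeError, ValueError) as exc:
            print(f"Skipping {filename}: {exc}")
            continue
        for line in lines:
            # Assumption: only lines mentioning the location are of interest.
            if not LOCATION_PATTERN.search(line):
                continue
            dates = DATE_PATTERN.findall(line)
            # Blank out the dates first so their digits are not read as amounts.
            remainder = DATE_PATTERN.sub(" ", line)
            amounts = AMOUNT_PATTERN.findall(remainder)
            records.append({
                "file": filename,
                "date": dates[0] if dates else "",
                "amounts": "; ".join(amounts),
                "text": line,
            })
    return pd.DataFrame(records, columns=["file", "date", "amounts", "text"])


# Usage with two small made-up sample files in a temporary folder.
with tempfile.TemporaryDirectory() as data_folder:
    with open(os.path.join(data_folder, "notes.txt"), "w", encoding="utf-8") as f:
        f.write("2023-05-01 阿苏港 transfer 1500.50\nunrelated line 42\n")
    with open(os.path.join(data_folder, "ledger.csv"), "w", encoding="utf-8") as f:
        f.write("2023-06-15,阿苏港,300\n2023-06-16,elsewhere,20\n")

    result = extract_records(data_folder)

    # Save the collected rows to a CSV file for later analysis.
    output_path = os.path.join(data_folder, "extracted.csv")
    result.to_csv(output_path, index=False, encoding="utf-8")
    saved = pd.read_csv(output_path)

print(result)
```

Removing the dates before searching for amounts is a design choice in this sketch: the amount pattern matches any run of digits, so the year, month, and day would otherwise be reported as amounts too.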

Mei Xiang stood to one side, stunned, watching the fully engrossed Chen Cheng as though he were a hacker coding something remarkable.