Note: This article is reproduced from serverfault.com.

How do I hash specific columns from a csv file?

Posted on 2019-04-20 16:48:55

I'm trying to hash columns 2 and 8, but I ended up hashing the entire file. What's the issue with my code?

import csv
import hashlib


with open('UserInfo.csv') as csvfile:

    with open('UserInfo_Hashed.csv', 'w') as newfile:

        reader = csv.DictReader(csvfile)

        for r in reader:

            hashing = hashlib.sha256((r['Password']).encode('utf-8')).hexdigest()

            newfile.write(hashing + '\n')


Questioner: user11385776
Khalid Ali 2019-04-21 03:11:24

Since your code only attempts to hash the Password column, the following code hashes just that column.

import csv
import hashlib

with open('UserInfo.csv', newline='') as csvfile:

    with open('UserInfo_Hashed.csv', 'w', newline='') as newfile:

        reader = csv.DictReader(csvfile)

        # DictWriter quotes values that contain commas, unlike ','.join()
        writer = csv.DictWriter(newfile, fieldnames=reader.fieldnames)

        # writing csv headers
        writer.writeheader()

        for r in reader:
            # hashing the 'Password' column
            r['Password'] = hashlib.sha256((r['Password']).encode('utf-8')).hexdigest()

            # writing the new row to the file with hashed 'Password'
            writer.writerow(r)

The issue with your code is the line newfile.write(hashing + '\n'): it writes only the hashed password to the file, without the other columns. You also didn't write the CSV header to the new file.
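
Since the question mentions two columns, the same idea can be extended to several named columns at once. What follows is only a minimal sketch, not part of the original answer: the helper names sha256_hex and hash_columns are made up for illustration, and it assumes every listed column name actually appears in the CSV header.

```python
import csv
import hashlib


def sha256_hex(value):
    """Return the SHA-256 hex digest of a string."""
    return hashlib.sha256(value.encode('utf-8')).hexdigest()


def hash_columns(src_path, dst_path, columns):
    """Copy a CSV file, replacing the named columns with their SHA-256 hashes."""
    with open(src_path, newline='') as src, \
            open(dst_path, 'w', newline='') as dst:
        reader = csv.DictReader(src)
        # Reuse the input header so the column order is preserved.
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            for name in columns:
                row[name] = sha256_hex(row[name])
            # All other columns are written back unchanged.
            writer.writerow(row)
```

It could be called as hash_columns('UserInfo.csv', 'UserInfo_Hashed.csv', ['Password', 'Email']), where 'Email' is a hypothetical second column name; substitute the real header names of the columns you want hashed.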


I strongly suggest using Pandas:

import pandas as pd
import hashlib

# reading CSV input
df = pd.read_csv('UserInfo.csv')

# hashing the 'Password' column
df['Password'] = df['Password'].apply(
        lambda x: hashlib.sha256(x.encode('utf-8')).hexdigest())

# writing the new CSV output
df.to_csv('UserInfo_Hashed.csv', index=False)
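
The question refers to the columns by position (2 and 8) rather than by name. If you prefer to select them that way without Pandas, a rough sketch with the standard csv module might look like the following. The function name hash_columns_by_position is hypothetical, and the sketch assumes the positions are counted from 1, the first row is a header, and every non-empty row has at least as many fields as the largest position.

```python
import csv
import hashlib


def hash_columns_by_position(src_path, dst_path, positions):
    """Copy a CSV file, hashing the columns at the given 1-based positions."""
    indexes = [p - 1 for p in positions]  # convert to 0-based indexes
    with open(src_path, newline='') as src, \
            open(dst_path, 'w', newline='') as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        header = next(reader, None)
        if header is None:
            return  # empty input: nothing to copy
        writer.writerow(header)  # the header row is copied unchanged
        for row in reader:
            if not row:
                continue  # skip blank lines
            for i in indexes:
                row[i] = hashlib.sha256(row[i].encode('utf-8')).hexdigest()
            writer.writerow(row)
```

For the file in the question this would be called as hash_columns_by_position('UserInfo.csv', 'UserInfo_Hashed.csv', [2, 8]).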